Abstract

We study the number of random records in an arbitrary split tree
(or equivalently, the number of random cuttings required to eliminate
the tree). We show that a classical limit theorem for convergence of
sums of triangular arrays to infinitely divisible distributions can be
used to determine the distribution of this number. After normalization
the distributions are shown to be asymptotically weakly 1-stable. This
work is a generalization of our earlier results for the random binary
search tree in [10], which is one specific case of split trees. Other
important examples of split trees include m-ary search trees, quadtrees,
medians of (2k+1)-trees, simplex trees, tries and digital search trees.

1 Introduction

1.1 Preliminaries

We study the number of records in random split trees, which were introduced
by Devroye [3]. As shown by Janson [13], this number is equivalent (in
distribution) to the number of cuts needed to eliminate this type of tree.

Given a rooted tree T, let each vertex v have a random value $\lambda_v$ attached
to it, and assume that these values are i.i.d. with a continuous distribution.
We say that the value $\lambda_v$ is a record if it is the smallest value in the path
from the root to v. Let $X_v(T)$ denote the (random) number of records.
Alternatively one may attach random variables to the edges and let $X_e(T)$
denote the number of edges with record values. Only the order relations of
the $\lambda_v$'s are important, so the distribution of $\lambda_v$ does not matter, i.e., one
can choose any continuous distribution for $\lambda_v$.

The same random variables appear when we consider cuttings of the tree
T as introduced by Meir and Moon [19] with the following definition. Make
a random cut by choosing one vertex [respectively edge] at random. Delete
this vertex [respectively edge] so that the tree separates into several parts and
keep only the part containing the root. Continue recursively until the root
is cut [respectively only the root is left]. Then the total (random) number
of cuts made is $X_v(T)$ [respectively $X_e(T)$]. More precisely, cuttings and
records give random variables with the same distribution. The proof of this
equivalence uses a natural coupling argument as shown in [HI [13] . 
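
The coupling behind this equivalence can be illustrated numerically. The following is a minimal Python sketch and is not taken from the cited papers. It assumes the tree is given as a parent array (a hypothetical encoding with the root at index 0 and parent[v] < v), counts the records of a label vector, and runs the vertex-cutting procedure in which the next cut is always the remaining vertex with the smallest label. Since the labels are i.i.d. and continuous, that vertex is uniformly distributed among the remaining ones, and a vertex is cut exactly when its label is smaller than the labels of all its ancestors.

```python
import random

def count_records(parent, labels):
    """Number of vertices whose label is the smallest on the path from the root.

    parent[0] == -1 marks the root; parent[v] < v is assumed for v > 0.
    """
    path_min = [0.0] * len(parent)
    records = 0
    for v, label in enumerate(labels):
        above = path_min[parent[v]] if parent[v] >= 0 else float("inf")
        if label < above:
            records += 1
        path_min[v] = min(label, above)
    return records

def count_cuts(parent, labels):
    """Vertex cuts made until the root is cut, trying vertices in label order."""
    removed = [False] * len(parent)
    cuts = 0
    for v in sorted(range(len(parent)), key=lambda u: labels[u]):
        u = v
        while u >= 0 and not removed[u]:   # walk up towards the root
            u = parent[u]
        if u >= 0:
            continue                        # an ancestor was cut earlier, so v is gone
        removed[v] = True
        cuts += 1
        if v == 0:                          # the root itself has been cut
            break
    return cuts

def random_recursive_parents(n, rng):
    """Hypothetical test tree: vertex v attaches to a uniformly chosen earlier vertex."""
    return [-1] + [rng.randrange(v) for v in range(1, n)]

rng = random.Random(1)
parent = random_recursive_parents(200, rng)
labels = [rng.random() for _ in parent]
print(count_records(parent, labels), count_cuts(parent, labels))
```

The two printed numbers coincide for every label vector without ties, which is the pathwise form of the statement that cuttings and records have the same distribution.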

In [13] the asymptotic distributions for the number of cuts (or the number
of records) are found for random trees that can be constructed as conditioned
Galton-Watson trees, e.g., labelled trees and random binary trees. There the
proof relies on the fact that the method of moments can be used.

For the deterministic (non-random) complete binary tree it is, however,
not possible to use the method of moments. To deal with this Janson [13]
introduced another strategy, which is to approximate $X_v(T)$ by a sum of
independent random variables derived from $\lambda_v$, and then apply a classical
limit theorem for triangular arrays, see e.g., [16, Theorem 15.28]. We recently
showed that Janson's approach could also be applied to the random binary
search tree [10].

In this paper we consider all types of (random) split trees defined by
Devroye [3]; the binary search tree that we consider in [10] is one example of
such trees. Some other important examples of split trees are m-ary search
trees, quadtrees, median of (2k+1)-trees, simplex trees, tries and digital
search trees. The split trees belong to the family of so-called log n trees, that
is, trees with height (maximal depth) a.a.s. $O(\log n)$. (For the notation
a.a.s. see [15].) These have similar properties to the deterministic complete
binary tree with height $\lfloor \log_2 n \rfloor$ considered in [13]. In the complete binary
tree (with high probability) most vertices are close to $\lfloor \log_2 n \rfloor$ (the height
of the tree). In split trees on the other hand (with high probability) most
vertices are close to depth $\sim c \ln n$, where c is a constant (it is natural to use
the e-logarithm); for the binary search tree that we investigated in [10] this
depth is ~ 2 Inn (e.g., [^). Here by the use of renewal theory we extend the 
methods used in [10] for the specific case of the binary search tree to show
that also for split trees in general it is possible to apply a limit theorem, see
e.g., [16, Theorem 15.28], for convergence of sums of triangular arrays to
infinitely divisible distributions to determine the asymptotic distribution of
$X_v(T)$.

The split tree generating algorithm: 

The formal, comprehensive "split tree generating algorithm" is as follows
with the following introductory notation, see [3] and [11]. A split tree is a
finite subtree of a skeleton tree $S_b$ (i.e., an infinite rooted tree in which each
vertex has exactly b children that are numbered 1, 2, ..., b). The split tree
is constructed recursively by distributing balls one at a time to generate a
subset of vertices of $S_b$. We say that the tree has cardinality n, if n balls are
distributed. There is also a so-called vertex capacity, s > 0, which means
that each node can hold at most s balls. Each vertex v of $S_b$ is given an
independent copy of the so-called random split vector $\mathcal{V} = (V_1, V_2, \ldots, V_b)$ of
probabilities, where $\sum_i V_i = 1$, $V_i \ge 0$. There are also two other parameters:
$s_0$, $s_1$ (related to the parameter s) that occur in the algorithm below; see
Figure 1 and Figure 2 where two examples of split trees are illustrated. Let
$n_v$ denote the total number of balls that the vertices in the subtree rooted
at vertex v hold together, and $C_v$ be the number of balls that are held by
v itself. We say that a vertex v is a leaf in a split tree if the node itself
holds at least one ball but no descendants of v hold any balls. An equivalent
definition of a leaf is to say that v is a leaf if and only if $C_v = n_v > 0$. A
vertex $v \in S_b$ is included in the split tree if, and only if, $n_v > 0$; if $n_v = 0$,
the vertex v is not included and it is called useless.

Figure 1: This figure illustrates a split tree with parameters b = 4, s = 3, $s_0 = 1$
and $s_1 = 0$. All internal vertices have $s_0 = 1$ balls. All leaves have between 1 and
s = 3 balls.

Figure 2: This figure illustrates a split tree with parameters b = 2, s = 4, $s_0 = 0$
and $s_1 = 2$. All internal vertices have $s_0 = 0$ balls. All leaves have between 2 and
s = 4 balls. Note that $s_1$ is at most 2.

Below there is a description of the algorithm which determines how the
n balls are distributed over the vertices. Initially there are no balls, i.e.,
$C_v = 0$ for each vertex v. Choose an independent copy $\mathcal{V}_v$ of $\mathcal{V}$ for every
vertex $v \in S_b$. Add balls one by one to the root by the following recursive
procedure for adding a ball to the subtree rooted at v.

1. If v is not a leaf, choose child i with probability $V_i$ and recursively add
the ball to the subtree rooted at child i, by the rules given in steps 1,
2 and 3.

2. If v is a leaf and $C_v = n_v < s$, then add the ball to v and stop. Thus,
$C_v$ and $n_v$ increase by 1.

3. If v is a leaf and $C_v = n_v = s$, the ball cannot be placed at v since it
is occupied by the maximal number of balls it can hold. In this case,
let $n_v = s + 1$ and $C_v = s_0$, by placing $s_0 \le s$ randomly chosen balls
at v and $s + 1 - s_0$ balls at its children. This is done by first giving
$s_1$ randomly chosen balls to each of the b children. The remaining
$s + 1 - s_0 - b s_1$ balls are placed by choosing a child for each ball
independently according to the probability vector $\mathcal{V}_v = (V_1, V_2, \ldots, V_b)$,
and then using the algorithm described in steps 1, 2 and 3 applied to
the subtree rooted at the selected child.
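
The three steps above can be sketched in Python as follows. This is an illustrative sketch under stated assumptions, not code from [3] or [11]: the names Node, add_ball, build_split_tree, bst_split and nodes are hypothetical, an empty vertex is treated like a leaf with room (so that the first ball can be placed at the root), and only ball counts are tracked, so the "randomly chosen balls" of step 3 are not distinguished individually.

```python
import random

class Node:
    """A vertex of the skeleton tree S_b."""
    def __init__(self):
        self.own = 0          # C_v: balls held by the vertex itself
        self.total = 0        # n_v: balls in the subtree rooted at the vertex
        self.split = None     # V_v, drawn when the vertex first splits
        self.children = None

def add_ball(v, b, s, s0, s1, draw_split, rng):
    """Add one ball to the subtree rooted at v, following steps 1-3."""
    while v.total > v.own:                      # step 1: v is not a leaf, go down
        v.total += 1
        v = rng.choices(v.children, weights=v.split)[0]
    if v.total < s:                             # step 2: leaf (or empty vertex) with room
        v.own += 1
        v.total += 1
        return
    # step 3: full leaf; keep s0 balls, give s1 to every child, spread the rest
    v.own, v.total = s0, s + 1
    v.children = [Node() for _ in range(b)]
    v.split = draw_split(b, rng)
    for child in v.children:
        child.own = child.total = s1
    for _ in range(s + 1 - s0 - b * s1):
        child = rng.choices(v.children, weights=v.split)[0]
        add_ball(child, b, s, s0, s1, draw_split, rng)

def bst_split(b, rng):
    """Binary search tree: V = (U, 1 - U) with U uniform on (0, 1)."""
    u = rng.random()
    return (u, 1.0 - u)

def build_split_tree(n, b, s, s0, s1, draw_split, seed=0):
    """Distribute n balls one at a time, starting from the root."""
    rng = random.Random(seed)
    root = Node()
    for _ in range(n):
        add_ball(root, b, s, s0, s1, draw_split, rng)
    return root

def nodes(root):
    """All vertices with n_v > 0, i.e. the vertices of the split tree."""
    stack, found = [root], []
    while stack:
        v = stack.pop()
        if v.total > 0:
            found.append(v)
            stack.extend(v.children or [])
    return found

tree = build_split_tree(1000, b=2, s=1, s0=1, s1=0, draw_split=bst_split)
print(tree.total, len(nodes(tree)))   # balls and vertices of a binary search tree
```

With the binary search tree parameters b = 2, s = 1, $s_0 = 1$, $s_1 = 0$ every vertex of the tree holds exactly one ball, so the number of vertices equals the number of balls. For other parameter choices the vertex count returned by nodes is random even for fixed n.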

From [3] it follows that the integers $s_0$ and $s_1$ have to satisfy the inequality

$0 \le s_0 \le s, \quad 0 \le b s_1 \le s + 1 - s_0.$

We can assume that the components $V_i$ of the split vector $\mathcal{V}$ are identically
distributed. If this were not the case they can anyway be made identically
distributed by using a random permutation as explained in [3]. Let V be
a random variable with this distribution. This gives (because $\sum_i V_i = 1$)
that $E(V) = 1/b$. We use the notation $T^n$ to denote a split tree with n balls.
However, note that even given the fact that the split tree has n balls, the
number of nodes N is still a random number. The only parameters that are
important in this work (and in general these parameters are the important
ones for most results concerning split trees) are the cardinality n, the branch
factor b and the split vector $\mathcal{V}$; this is illustrated in Section 1.4.1. In a binary
search tree b = 2, the split vector $\mathcal{V} = (V_1, V_2)$ is distributed as (U, 1 - U)
where U is a uniform U(0, 1) random variable. For the binary search tree
the number of balls n is the same number as the number of vertices N; this
is not true for split trees in general.



1.2 Some Important Facts and Results for Split Trees 

1.2.1 Results Concerning Depth Analysis 

In [3, Theorem 1] Devroye presents a strong law and a central limit law for the
depth $D_n$ of the last inserted ball in a split tree with n balls and split vector
$\mathcal{V}$. Recall that V is distributed as the identically distributed components in
the split vector. Let

$\mu := b\,E(-V \ln V),$
$\sigma^2 := b\,E(V \ln^2 V) - \mu^2.$ (1)

If P(V = 1) = 0 and P(V = 0) < 1, then

$\frac{D_n}{\ln n} \xrightarrow{p} \mu^{-1}$ (2)

and

$\frac{E(D_n)}{\ln n} \to \mu^{-1}.$ (3)

Furthermore, if $\sigma > 0$, then

$\frac{D_n - \mu^{-1} \ln n}{\sqrt{\sigma^2 \mu^{-3} \ln n}} \xrightarrow{d} N(0, 1),$ (4)

where N(0, 1) denotes the standard Normal distribution and $\xrightarrow{d}$ denotes
convergence in distribution. Assuming that $\sigma > 0$ is equivalent to assuming that
V is not monoatomic, i.e., it is not the case that V = 1/b.
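
As a sanity check of these constants, the following Python sketch evaluates $\mu$ and $\sigma^2$ numerically for the binary search tree, where b = 2 and V is uniform on (0, 1). The helper midpoint_integral is a hypothetical name introduced for this illustration, and the plain midpoint rule is an assumption of convenience, not a method used in the paper.

```python
import math

def midpoint_integral(f, cells=200_000):
    """Midpoint rule on (0, 1); the integrands below vanish at both endpoints."""
    h = 1.0 / cells
    return h * sum(f((i + 0.5) * h) for i in range(cells))

b = 2  # binary search tree: each component of the split vector is uniform on (0, 1)
mu = b * midpoint_integral(lambda u: -u * math.log(u))
sigma2 = b * midpoint_integral(lambda u: u * math.log(u) ** 2) - mu ** 2

print(1 / mu)            # leading constant of the depth, close to 2
print(sigma2 / mu ** 3)  # leading constant of the variance of the depth, close to 2
```

The exact values are $\mu = 1/2$ and $\sigma^2 = 1/4$, so $\mu^{-1} = 2$, in agreement with the typical depth $\sim 2 \ln n$ of the binary search tree mentioned in Section 1.1.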

Let $D_k$ be the depth of the k-th ball. In [11, Theorem 2.3], by using the
same assumptions for V as Devroye uses for proving the limit laws above, we also show
results concerning the variance of depths, i.e., for all $n / \ln n \le k \le n$,

$\frac{\mathrm{Var}(D_k)}{\ln n} \to \sigma^2 \mu^{-3}.$

1.2.2 Results Concerning the Number of Nodes 



(A1). In this work we assume as in Section 1.2.1 that P(V = 1) = 0, and
as in [11] for simplicity we also assume that P(V = 0) = 0 and that $-\ln V$
is non-lattice.

We make the non-lattice assumption since we use renewal theory for sums
depending on the distribution of $-\ln V$, and in renewal theory it often becomes
necessary to distinguish between lattice and non-lattice distributions.
Tries and digital search trees are special forms of split trees with a random
permutation of deterministic components $(p_1, p_2, \ldots, p_b)$ and therefore not
as random as many other examples. Of the common split trees only for
some special cases of tries and digital search trees (e.g., the symmetric ones
$p_1 = p_2 = \cdots = p_b = 1/b$) does $-\ln V$ have a lattice distribution. By assuming
that (A1) holds we show in [11, Theorem 2.1] that there is a constant $\alpha$ depending
on the type of split tree such that for the random number of nodes
N we have that

$E(N) = \alpha n + o(n)$ (5)

and

$\mathrm{Var}(N) = o(n^2).$ (6)

Let d(v) denote the depth of a node. In [11, Theorem 2.2] we show
that the expected number of nodes in a tree with n balls, where $d(v) \le
\mu^{-1} \ln n - \ln^{0.5+\epsilon} n$ or $d(v) \ge \mu^{-1} \ln n + \ln^{0.5+\epsilon} n$, for some arbitrary $\epsilon > 0$, is
$O(n / \ln^k n)$, for any constant k. In this paper we use this bound for one
particular fixed k. In [11, Remark 4.3] we also note that for any constant r there is a
constant C > 0 such that the expected number of nodes with $d(v) \ge C \ln n$
is $O(1/n^r)$; hence, we can bound the number of vertices with "large" depths
with very small error terms.

1.2.3 Results Concerning the Total Path Length 

In the present study we consider the "total path length" of a tree T as the
sum of all depths of the vertices in T (distances to the root). Since the split
tree is a random tree the total path length is a random variable, which we
denote by $\Upsilon(T)$. However, a more natural definition of the total path length
is probably the sum of all depths of balls in T, which we denote by $\Psi(T)$.
From (3) it follows that

$E(\Psi(T^n)) = \mu^{-1} n \ln n + n q(n),$ (7)

where $q(n) = o(\ln n)$ is a function that depends on the type of split tree. By
using (3) and (5) we easily show in [11] that

$E(\Upsilon(T^n)) = \mu^{-1} \alpha n \ln n + n r(n),$ (8)

where $\alpha$ is the constant that occurs in (5) and $r(n) = o(\ln n)$ is a function
that depends on the type of split tree.

(A2). Assume that the function q(n) in (7) converges to some constant q.

In [20] there is an analogous assumption. Examples of split trees where it
is shown that q(n) converges to a constant are binary search trees (e.g. [7]),
random m-ary search trees [17], quad trees [20] and the random median of a
(2k+1)-tree [21], tries and Patricia tries [2].

(A3). Assume that the result in (5) can be improved such that

$E(N) = \alpha n + f(n),$

where f{n) = o{j^^. 

Stronger second order terms of the size have previously been shown to 
hold e.g., for m-aiy search trees [^, for these f{n) in assumption (A3) is 

$o(\sqrt{n})$ when $m \le 26$ and is $O(n^{1-\epsilon})$ when $m \ge 27$. Further, as described in
Section 1.2.2, tries are special cases of split trees which are not as random as
other types of split trees. Flajolet and Vallée (personal communication) have
recently shown that also for most tries (as long as $-\ln V$ is not too close to
being lattice) assumption (A3) holds.

In [11, Theorem 5.1] by assuming (A2) and (A3) we show that r(n) in
(8) converges to some constant $\zeta$. In [11, Theorem 5.2] by applying [11,
Theorem 5.1] we show the following result, which we will apply in the proof
of the main theorem below: Let $L = \lfloor \beta \log_b \ln n \rfloor$ for some large constant $\beta$,
then

V^ T(T,) ^ 4^ arij n( { ^ \ (q\ 

^/i-21n=^n, ^/i-iln^i /i-21n2n^ ^Vln^n^' ^ ' 

1=1 ^ 4 = 1 ' 

where $\zeta$ is the constant that r(n) converges to.

1.3 The Main Theorem 

The main theorem of this study is presented below: 

Theorem 1.1. Let $n \to \infty$, and suppose that assumptions (A1)-(A3) hold.
Then

/ yU ^ in n 



where 

an an In Inn Cn 

1 ^— , 

/i~Mn n /i-i In^ n fi'^ In^ n 



\J(-I 1/ \JCI 1/ 1.1.1. IIJ. I L S ^ / 1 -1 \ 



for the constant $\zeta$ in (9), and where W has an infinitely divisible distribution.
More precisely W has a weakly 1-stable distribution, with characteristic
function

$E(e^{itW}) = \exp\left(-\frac{\mu^{-1}}{2} \pi |t| + it\left(C - \mu^{-1} \ln |t|\right)\right),$ (12)

where $\mu$ is the constant in (1), $\alpha$ is the constant in (5) and C is a constant
which is defined in (15) below. The same result holds for $X_e(T^n)$.

Remark 1.1. Even if we only have $E(N) = \alpha n + o(n)$ as in (5) (i.e., ignoring
the assumptions (A2)-(A3)) the normalized $X_v(T^n)$ (or $X_e(T^n)$) ought to still
converge to a weakly 1-stable distribution with characteristic function as in
(12) for some constant C. However, in this case $C_n$ in (10) ought to be
E(A^) ot:./ Y- E(iV,|n,)ln(^)\ E{N)L 



/i-ilnn V.rr^r /i"^ln^n / 



^ -„ .. , /i-Un n 

d{v)=L 

an In Inn -^f ST^ '^(Tj) 



/i ^ In n \ -^ /i ^In n^ 



n, 



where $\Upsilon(T_i)$ is the total path length for the nodes of the subtrees $T_i$ rooted
at depth L.

The class of $\alpha$-stable distributions is included in the larger class of
infinitely divisible distributions. The general formula for the characteristic
function of an infinitely divisible distribution is

$\exp\left(itb - \frac{a t^2}{2} + \int_{-\infty}^{\infty} \left(e^{itx} - 1 - itx \mathbf{1}[|x| < 1]\right) d\nu(x)\right),$ (13)

for constants $a \ge 0$, $b \in \mathbb{R}$, where $\nu$ is the so-called Lévy measure. The
characteristic function in (13) of a 1-stable distribution (i.e., $\alpha = 1$) can be
simplified to

$\exp\left(idt - c|t|\left(1 + i\beta \frac{2}{\pi} \mathrm{sign}(t) \ln |t|\right)\right)$

for constants $c > 0$, $\beta \in [-1, 1]$ and $d \in \mathbb{R}$. If the Lévy measure $\nu$ in (13)
satisfies $\frac{d\nu}{dx} = \frac{c_\pm}{|x|^{\alpha+1}}$ on $\mathbb{R}_\pm$, for $\alpha \in (0, 2)$ and constants $c_\pm$, the corresponding
infinitely divisible distribution is weakly $\alpha$-stable. The most well-known
1-stable distribution is the Cauchy distribution. However, in contrast to the
distribution of W in Theorem 1.1 (which is weakly 1-stable), the Cauchy
distribution is strictly 1-stable and symmetric. The random variable W in
Theorem 1.1 has support on $(-\infty, \infty)$, and has a heavy tailed distribution.
As for other random variables with $\alpha$-stable distributions where $\alpha < 2$ the
variance of W is infinite. Also since $\alpha \le 1$ the expected value of W is
not defined. For further information about stable distributions, see e.g., [5,
Section XVII.3].



Remark 1.2. In the proof of Theorem 1.1 we get

$E(e^{itW}) = \exp\left(it\left(C + \mu^{-1}\gamma - \mu^{-1}\right) + \int_0^\infty \left(e^{itx} - 1 - itx \mathbf{1}[x < 1]\right) d\nu(x)\right),$ (14)

where C is the constant in (12), $\gamma$ is the Euler constant and the Lévy measure
$\nu$ is supported on $(0, \infty)$ and has density

$\frac{d\nu}{dx} = \frac{\mu^{-1}}{x^2}.$

Thus, we see that W has a weakly 1-stable distribution. The constant C can
be expressed as

2 2 
C = -fx-' In^-i + 2/i-i - /i" V - /i-S - "" ~ ^ , (15) 

where $\mu$ and $\sigma^2$ are the constants in (1). We can simplify the expression in
(14) to get (12) above.
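
The simplification of (14) to (12) can be sketched as follows. This is a standard computation, not quoted from the source, and it assumes that the Lévy measure has density $\mu^{-1}/x^2$ on $(0, \infty)$ and that the drift term in (14) reads $it(C + \mu^{-1}\gamma - \mu^{-1})$. For $t > 0$ (the case $t < 0$ follows by complex conjugation):

```latex
\begin{align*}
\int_0^\infty \frac{\cos(tx) - 1}{x^2}\,dx &= -\frac{\pi}{2}\,t, \\
\int_0^\infty \frac{\sin(tx) - tx\,\mathbf{1}[x<1]}{x^2}\,dx &= t\,(1 - \gamma - \ln t), \\
\int_0^\infty \bigl(e^{itx} - 1 - itx\,\mathbf{1}[x<1]\bigr)\,d\nu(x)
  &= -\frac{\mu^{-1}}{2}\,\pi t - i\mu^{-1} t \ln t + i\mu^{-1}(1-\gamma)\,t .
\end{align*}
```

Adding the drift term $it(C + \mu^{-1}\gamma - \mu^{-1})$ cancels the $(1 - \gamma)$ contribution and leaves the exponent $-\frac{\mu^{-1}}{2}\pi t + it(C - \mu^{-1} \ln t)$, which is the exponent in (12).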



Remark 1.3. We note in analogy with [13] and [10] that most records occur
close to the depth where most vertices are, i.e., $\sim \mu^{-1} \ln n$ for split trees. Also
in analogy with [13] and [10], from Lemma 2.4 and the proof of Theorem 2.1
it follows that most of the random fluctuations of $X_v(T^n)$ can be explained
by the values at depths close to $\ln \ln n$.

Remark 1.4. For random trees [10], $E(X_e(T)) = E\left(\sum_{v \ne \sigma} \frac{1}{d(v)}\right)$ (where $\sigma$ is
the root) and $E(X_v(T)) = E\left(\sum_{v} \frac{1}{d(v)+1}\right)$. Thus, as we noted for the specific
case of the binary search tree [10, Remark 1.3] also for all other split trees

$E(X_e(T^n)) - E(X_v(T^n)) = E\left(\sum_{v \ne \sigma} \frac{1}{d(v)(d(v)+1)}\right) - 1 \sim C_1 \frac{n}{\log^2 n},$

for some constant $C_1 > 0$, while there is no similar difference in the limit
distribution, see Theorem 1.1 above. As in [10], this behaviour suggests that
it is impossible to use the method of moments to find the record distribution
for split trees as one could do for the conditioned Galton-Watson trees in
[13]. In [10] we instead used methods similar to those that Janson used for
the complete binary tree in [13]. In this paper we generalize the proofs in
[10] to consider general split trees.



Remark 1.5. Most likely the method that is used here should work for
other trees of logarithmic height as well, and thus the limiting distribution
for these trees should also be infinitely divisible and probably also weakly
1-stable. This turns out to be the case for the random recursive tree (that
is a logarithmic tree), where the limiting distribution of $X_e(T)$ was recently
found to be weakly 1-stable, see [5, Theorem 1.1] and [12, Theorem 1.1].
However, the methods used for the recursive tree in [5], [12] differ completely
from our methods. The advantage of studying split trees compared to the
whole class of log n trees is that there is a common definition that describes
all split trees, and this is the reason why we only consider these trees in this
paper.

1.4 Renewal theory applications for studies of split trees

1.4.1 Subtrees 

For the split tree where the number of balls n > s, there are $s_0$ balls in
the root and the cardinalities of the b subtrees are distributed as $(s_1, \ldots, s_1)$
plus a multinomial vector $(n - s_0 - b s_1, V_1, \ldots, V_b)$. Thus, conditioning on the
random $\mathcal{V}$-vector that belongs to the root, the subtrees rooted at the children
have cardinalities close to $nV_1, \ldots, nV_b$. This is often used in applications of
random binary search trees. In particular we used this frequently in [10].
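
The multinomial description of the root split can be sampled directly. The sketch below is only an illustration of that description under the stated distribution; the function name child_cardinalities and its signature are hypothetical.

```python
import random

def child_cardinalities(n, s0, s1, split, rng):
    """Sample the subtree sizes below the root for a tree with n > s balls.

    Each child gets s1 balls, and the other n - s0 - b*s1 balls pick a child
    independently according to the split vector (a multinomial vector).
    """
    b = len(split)
    sizes = [s1] * b
    for i in rng.choices(range(b), weights=split, k=n - s0 - b * s1):
        sizes[i] += 1
    return sizes

rng = random.Random(7)
u = rng.random()
split = (u, 1.0 - u)                       # binary search tree split vector
sizes = child_cardinalities(10_000, s0=1, s1=0, split=split, rng=rng)
print(sizes, [round(10_000 * p) for p in split])   # sizes are close to n * V_i
```

Conditionally on the split vector, each sampled size deviates from $nV_i$ by an amount of order $\sqrt{n}$, which is the sense in which the cardinalities are "close to" $nV_1, \ldots, nV_b$.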

Conditioning on the split vectors, $n_v$ at depth d is, in the stochastic sense,
bounded from above by

$n_v \le \mathrm{Binomial}\left(n, \prod_{r=1}^{d} W_{r,v}\right) + s_1 d,$ (16)

and bounded from below by

$n_v \ge \mathrm{Binomial}\left(n, \prod_{r=1}^{d} W_{r,v}\right) - s d,$ (17)

where $W_{r,v}$, $r \in \{1, \ldots, d\}$, are i.i.d. random variables given by the split
vectors associated with the nodes in the unique path from v to the root, see
[3] and [11]. This means in particular that $W_{r,v} \stackrel{d}{=} V$. An application of the
Chebyshev inequality gives that $n_v$ for v at depth d is close to

$M_v^n := n W_{1,v} W_{2,v} \cdots W_{d,v},$ (18)

see [11]. Since the $n_v$'s (conditioned on the split vectors) for all v at the same
depth are identically distributed, we sometimes skip the vertex index of $W_{r,v}$
in (16) and just write $W_r$.



1.4.2 Results Obtained by Using Renewal Theory 

In [11] we introduce renewal theory in the context of split trees, and in this
study we use this theory frequently for the proof of the main theorem, i.e.,
Theorem 1.1.

For each vertex v, where $W_{r,v} \stackrel{d}{=} V$ are the i.i.d. random variables defined
in Section 1.4.1, let $Y_{k,v} := -\sum_{r=1}^{k} \ln W_{r,v}$. Below we skip the vertex index
and just write $Y_k$, since for vertices v on the same level k the $Y_{k,v}$'s are
identically distributed. This corresponds to the notation that we
use in [10] for the specific case of the binary search tree, where we define
$Y_k := -\sum_{r=1}^{k} \ln U_r$, where $U_r$ are uniform U(0, 1) random variables. Recall
from (18) in Section 1.4.1 that the subtree size $n_v$ for a vertex v at depth k
is close to $M_v^n$ and note that

$M_v^n := n W_{1,v} W_{2,v} \cdots W_{k,v} = n e^{-Y_{k,v}}.$

Recall that in a binary search tree, the split vector is distributed as (U, 1 - U)
where U is a uniform U(0, 1) random variable. For the binary search tree, the
sum $-\sum_{r=1}^{k} \ln U_r$ is distributed as a $\Gamma(k, 1)$ random variable. For general
split trees we do not know the common distribution function of $Y_k$; instead
we use renewal theory. (For an introduction to renewal theory, see e.g., [SI
Chapter II] or p].) We define the renewal function

$U(t) = \sum_{k=1}^{\infty} b^k P(Y_k \le t) = \sum_{k=1}^{\infty} F_k(t),$ (19)

and also denote $F(t) := F_1(t) = b P(-\ln W_{r,v} \le t)$, which in contrast to
standard renewal theory is not a probability measure. For U(t) we obtain
the following renewal equation

$U(t) = F(t) + \sum_{k=1}^{\infty} (F_k * F)(t) = F(t) + (U * F)(t).$
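
For the binary search tree the renewal function can be evaluated directly, because $Y_k$ is $\Gamma(k, 1)$ distributed and b = 2. A short calculation (not taken from the source) suggests that the series then sums to $2(e^t - 1)$. The Python sketch below checks this numerically; the function names are hypothetical, and the truncation levels are assumptions that are adequate only for moderate t.

```python
import math

def gamma_cdf(k, t, terms=400):
    """P(Gamma(k, 1) <= t) for integer k >= 1 and t > 0, as a Poisson tail sum."""
    term = math.exp(-t + k * math.log(t) - math.lgamma(k + 1))   # e^{-t} t^k / k!
    total, j = 0.0, k
    for _ in range(terms):
        total += term
        j += 1
        term *= t / j
    return total

def renewal_function_bst(t, k_max=200):
    """U(t) = sum_k b^k P(Y_k <= t) with b = 2 and Y_k ~ Gamma(k, 1)."""
    return sum(2.0 ** k * gamma_cdf(k, t) for k in range(1, k_max + 1))

for t in (1.0, 3.0, 6.0):
    print(t, renewal_function_bst(t), 2.0 * (math.exp(t) - 1.0))
```

The Poisson tail form is used instead of one minus a partial sum because the factor $2^k$ would otherwise amplify rounding errors. In this example $U(t) e^{-t} \to 2 = \mu^{-1}$ as t grows, matching the value $\mu = 1/2$ of the binary search tree.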



k=l 
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Recall the definitions of the constants jjl and cr in ([I]). In [HI Lemma 3.1] 
we show the following result which is fundamental for the proof of Theorem 
|1.1[ Let t — 7- oo, then the renewal function U{t) in ([19]) has the solution 

(20) 







U{t) = 


(/.-i + o(l))e' 




In 


|l Ij we also define 










W{x) := r 
Jo 


e"*([/(t)-/.- 


'e')dt, 


and in 1 1 1 Corollary 3 


.2], we show that 






W{x) 


2/x2 


-/i"' + o(l), 


as X ^ cxo. 


2 


Proofs 









(21) 



2.1 Notation 

Most of our notation is similar to that used in [10], where the
binary search tree is considered.

We use the notation $\log_b$ for the b-logarithm (recall that a split tree with
parameter b is a b-ary tree) and ln for the e-logarithm. Let $\{x\} = x - \lfloor x \rfloor$ be
the fractional part of a real number x. We treat the case $X_v(T^n)$ in Theorem
1.1 in detail and then indicate why the same result holds for $X_e(T^n)$ too.
From now on since it is clear that we consider the vertex model we just write
$X(T^n)$. First let $X(T^n)_y$ be $X(T^n) - 1$ conditioned on the root label $\lambda_\sigma = y$.

We write $\stackrel{d}{=}$ for equality in distribution.

We say that $Y_n = o_p(a_n)$ if $a_n$ is a positive number and $Y_n$ is a random
variable such that $Y_n / a_n \xrightarrow{p} 0$ as $n \to \infty$.

We say that $Y_n = O_{L^p}(a_n)$ if $a_n$ is a positive number and $Y_n$ is a random
variable such that $(E(|Y_n|^p))^{1/p} \le C a_n$ for some constant C.

We sometimes use the notation $m = \mu^{-1} \ln n$. For simplicity in the proofs
below we write ln n when we mean max{1, ln n}.

In the sequel we write T instead of $T^n$.

For a vertex $v \in T$, we let $T_v$ be the subtree of T rooted at v. Recall that
$n_v$ is the number of balls and similarly let $N_v$ be the number of nodes in $T_v$.

We write $\mathrm{Exp}(\theta)$ for an exponential distribution with parameter $\theta$, i.e.,
the density function $f(x) = \frac{1}{\theta} e^{-x/\theta}$. We can without loss of generality assume
that the labels $\lambda_v$ have an exponential distribution Exp(1). As mentioned
above this does not affect the distribution of $X(T^n)$.

Let d(v) denote the depth of v, i.e., distance to the root.
Recall that V is a random variable distributed as the identically
distributed components in the split vector $\mathcal{V} = (V_1, \ldots, V_b)$. Also recall that for
each vertex v we let $Y_{k,v} := -\sum_{r=1}^{k} \ln W_{r,v}$, where $W_{r,v} \stackrel{d}{=} V$ are the i.i.d.
random variables defined in Section 1.4.1. Since the $Y_{k,v}$'s are identically
distributed for vertices at the same depth, we sometimes skip the
vertex index and just write $Y_k$. Recall from (19) that we define the renewal
function $U(t) := \sum_{k=1}^{\infty} b^k P(Y_k \le t)$.

Let $\Lambda_{v_i}$ be the minimum of $\lambda_v$ along the path $P(v_i) = \sigma, \ldots, v_i$, from
the root $\sigma$ of T to $v_i$, $1 \le i \le b^L$, where $v_i$ are the vertices at depth
$L = \lfloor \beta \log_b \ln n \rfloor$ for some constant $\beta$. Thus, the definition of $\Lambda_{v_i}$ and the
assumption $\lambda_v \stackrel{d}{=} \mathrm{Exp}(1)$ give $\Lambda_{v_i} \stackrel{d}{=} \mathrm{Exp}\left(\frac{1}{L+1}\right)$.

For simplicity we write $T_i := T_{v_i}$, $n_i := n_{v_i}$, $N_i := N_{v_i}$ and $\Lambda_i := \Lambda_{v_i}$. We
also let $T_{iv}$ denote a subtree of $T_i$ rooted at v (note that $T_{iv}$ is $T_v$ for $v \in T_i$).
Let $n_{iv}$ denote the number of balls in $T_{iv}$.

We write $d_i(v) := d(v) - L$ (i.e., the depth in the subtree $T_i$, $i \in
\{1, \ldots, b^L\}$, of a vertex $v \in T_i$).

We say that a vertex v in $T^n$ is "good" if

$\mu^{-1} \ln n - \ln^{0.6} n \le d(v) \le \mu^{-1} \ln n + \ln^{0.6} n,$

and otherwise it is "bad". In particular a vertex $v \in T_i$ is "good" if

$\mu^{-1} \ln n_i - \ln^{0.6} n_i \le d_i(v) \le \mu^{-1} \ln n_i + \ln^{0.6} n_i,$ (22)

and otherwise it is "bad".

We define $\varphi(T_i, \Lambda_i) := E(X(T_i)_{\Lambda_i} \mid T_i, \Lambda_i)$ (the conditional expected value
of $X(T_i)_{\Lambda_i}$ given the tree $T_i$ and $\Lambda_i$). (We can think of $X(T_i)_{\Lambda_i}$ as $X(T_i) - 1$
conditioned on the root label $\lambda_{v_i} = \Lambda_i$.) Similarly we let $\psi(T_i, \Lambda_i) :=
\mathrm{Var}(X(T_i)_{\Lambda_i} \mid T_i, \Lambda_i)$ for vertices $v_i$, $1 \le i \le b^L$ (the conditional variance of
$X(T_i)_{\Lambda_i}$ given the tree $T_i$ and $\Lambda_i$).

The conditional expected value of a random variable Z given the subtree
size $n_i$ of $T_i$ is denoted by $E_{n_i}(Z) := E(Z \mid n_i)$.

We write ^^ := ""^"^ '"" • e"^"''"' ^"", which is used in the later part of the 
proof when we consider triangular arrays. 

We use the notation $\Omega_L$ for the $\sigma$-field generated by $\{n_v, d(v) \le L\}$.
Finally, we write $\mathcal{G}_j$ as the $\sigma$-field generated by the $\mathcal{V}$ vectors for all vertices
v with $d(v) < j$. Equivalently, this is the $\sigma$-field generated by $\{W_{r,v}, r \in
\{1, 2, \ldots, j\}\}$, for all vertices v with $d(v) = j$. In particular we use that the
subtree sizes $\{n_v, d(v) \le L\}$ up to small errors are determined by the $\sigma$-field
$\mathcal{G}_L$; this follows because of the representation of subtree sizes in Section 1.4.1.

2.2 Expressing the normalized number of records as a sum of triangular arrays

Recall from Section 2.1 that we define $\varphi(T_i, \Lambda_i) := E(X(T_i)_{\Lambda_i} \mid T_i, \Lambda_i)$, where
$T_i$ is the subtree rooted at $v_i$ at depth L and $\Lambda_i$ is the minimum of $\lambda_v$ in the
path from $v_i$ to the root $\sigma$ of T.

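Lemma 2.1. For all subtrees $T_i$ rooted at $v_i$ with $d(v_i) = L$, conditioned on
the subtree size $n_i$,

$\varphi(T_i, \Lambda_i) = \frac{N_i}{\mu^{-1} \ln n_i}\left(1 - e^{-(\mu^{-1} \ln n_i)\Lambda_i}\right) - \frac{\Upsilon(T_i) - \mu^{-1} N_i \ln n_i}{\mu^{-2} \ln^2 n_i} + \sum_{\text{good } v \in T_i} \frac{(d_i(v) - \mu^{-1} \ln n_i)^2}{\mu^{-3} \ln^3 n_i} + O_{L^1}\left(\frac{n_i}{\ln^{2.2} n_i}\right),$ (23)

where $\Upsilon(T_i)$ is the total path length of the tree $T_i$, and the good vertices $v \in T_i$
are those with $d_i(v)$ satisfying (22).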



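Proof. For each vertex $v \in T_i$, let $I_v$ be the indicator that $\lambda_v$ is the minimum
value given $T_i$ and $\Lambda_i$. We get $\varphi(T_i, \Lambda_i) = \sum_{v \ne v_i} E(I_v)$. If $d_i(v) = j$ in $T_i$,
let $v_i, v_{i1}, \ldots, v_{ij} = v$ be the vertices in the path from the root $v_i$ to v. Then,
$I_v = 1$ if and only if $\lambda_{v_{ij}} < \Lambda_i$ and $\lambda_{v_{ik}} > \lambda_{v_{ij}}$ for $k \in \{1, \ldots, j - 1\}$. Since
the $\lambda_v$'s (for all vertices v in $T_i$) are independent Exp(1) random variables,

$E(I_v) = \int_0^{\Lambda_i} \prod_{k=1}^{j-1} P(\lambda_{v_{ik}} > x)\, e^{-x}\, dx = \int_0^{\Lambda_i} e^{-jx}\, dx = \frac{1 - e^{-j\Lambda_i}}{j}.$ (24)

Thus,

$\varphi(T_i, \Lambda_i) = \sum_{v \ne v_i} \frac{1 - e^{-d_i(v)\Lambda_i}}{d_i(v)}.$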
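
The record probability of a single vertex can be checked by simulation. The following Python sketch is a Monte Carlo illustration under the assumptions of the proof (independent Exp(1) labels, a fixed threshold playing the role of $\Lambda_i$); the function name record_probability is hypothetical.

```python
import math
import random

def record_probability(j, threshold, samples, rng):
    """Monte Carlo estimate of E(I_v) for a vertex at depth j below v_i.

    The vertex is a record if its Exp(1) label is below `threshold` (the
    minimum above the subtree) and below the labels of the j - 1 vertices
    strictly between v_i and v.
    """
    hits = 0
    for _ in range(samples):
        label = rng.expovariate(1.0)
        between_min = min((rng.expovariate(1.0) for _ in range(j - 1)),
                          default=math.inf)
        if label < threshold and label < between_min:
            hits += 1
    return hits / samples

rng = random.Random(2024)
j, lam = 3, 0.5
print(record_probability(j, lam, 200_000, rng), (1 - math.exp(-j * lam)) / j)
```

The two printed values agree up to Monte Carlo error, in line with the closed form $(1 - e^{-j\Lambda_i})/j$.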



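Expanding $\frac{1}{d_i(v)}$ for arbitrary good $v \in T_i$ gives

$\frac{1}{d_i(v)} = \frac{1}{\mu^{-1} \ln n_i} - \frac{d_i(v) - \mu^{-1} \ln n_i}{\mu^{-2} \ln^2 n_i} + \frac{(d_i(v) - \mu^{-1} \ln n_i)^2}{\mu^{-3} \ln^3 n_i} + O\left(\frac{|d_i(v) - \mu^{-1} \ln n_i|^3}{\ln^4 n_i}\right).$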



Recall from Section 1.2.21 t hat the number of bad vertices in Ti, i.e., those 

and can thus be ignored. Thus, 



that are not in the strip in (22), is O^i 



In^ n, 
summing over all nodes $v \in T_i$ gives



r— = y — 

^d,{v) f- di(v) 

v^Vi good v^Vi 



+ 



Li 



rii 



N,. 



In nj/ yU Mnuj 



/i~2 In^ rii 



+ E 



((ii(i;) - fi ^Innif 



good vGTi 



/i^ 



'3 1n=^ 



Ui 



rii 






(25) 



Now we prove that 

y- 

which obviously implies 






Ui 



(26) 






(27) 



For simpler calculations we show the bound in ( 26 ) by considering, e~^' Lm in i 
instead of e~'^^ \nni)\i^ That one can do this is because multiplying the Tay- 
lor estimate in (25) by e^'^^'-M innij^ gives the same expression up to the error 
term Oli 



In rii 



and 



as multiplying by e '^^ \an^)\i^ Yqx j > o, 



g{-[^ ^lnni\-j)Ai ^ g-AiLA* ^InriiJ _|_g(-LM Mnn,J+j)Ai /•g-2jAi _g-jA,^ 



Since we only have to consider the good vertices it is enough to show that 

Qi + Q2 = Oli(-^), (28) 

Vln ■ rij/ 

where 

1 



LlnO-''n,J 






[//-MnnJ - j' 



LlnO-«n,J 

* - E E « 



(-L/i"Mnn,J+j)Ai _ Cg-2jAi _ g-jA^ 



[/i MnriiJ +j' 
We have



[/i-MnnJ -ln"-^n, 



(29) 



and similarly 



Q2 = N,0 



ln°-6n,A, 



im\(,{-ll^-^ lnn,J+lnO-6 n,)Ai 



./i 1 Inn, 
Since Aj is Exp(-j^) random variable, we get that 

Jo 



(L + l)ye(~l-M"^lnn»J+lnO-6n,-(L+l))y 



- [/i-1 In riij + ln°-^ rii - L - 1 

L + l 

([/i-ilnriij - \n°-^ rii + L + 1) 



dy 



Thus, (28) holds and it follows that (27) is satis fied. 



Cii(r^). Hence, 



Now we show that ( 27 ) implies ( 23 ) in Lemma 



2.1 



We have e-^'' iinn,)A, 



-(/^ llnni)Ai 



E 



((ii(t;) - fi ^Inriif 



good vGTi 



/i 



-3 1n=^ 



O 



Li 



ni 



ri,- 



Inn,- 



Recall from Section 



1.2.2 



that the number of bad nodes in Tj is C'li(j^^^' 



and that for any constant r there is a constant C > such that the number 
of nodes with d{v) > C Inn is Oiii^j. By using these facts we get an 
obvious upper bound of the total path length, i.e.. 



T{Ti)-fi-'Ni\nni \< Ndn'-^ m + Ol^ 



rii 



Hence, 



iT{T,)-^I-^N,\nn^)e 



-{fi Mnni)Ai 



fX- 



-2 1n2 



O 



Hi 



Inn,- 



ra,- 



L' \ , 2.2 
m n,- 



and Lemma 2.1 follows.

□

Recall from Section 2.1 that we define $\psi(T_i, \Lambda_i) := \mathrm{Var}(X(T_i)_{\Lambda_i} \mid T_i, \Lambda_i)$,
and that we write $E_{n_i}(\cdot) := E(\cdot \mid n_i)$ for the conditional expected value given
$n_i$.

Lemma 2.2. For all vertices $v_i$ with $d(v_i) = L$, conditioned on $n_i$,

Vln n,/ 

Proof. For all vertices $v \in T_i$, let $I_v$ be the same indicator as in the proof
of Lemma 2.1 above. Suppose that v and w are two vertices in $T_i$ at depth
$d_i(v) = j$, $d_i(w) = k$ with last common ancestor u at depth $d_i(u) = d$. Suppose
first that $d < j$, $d < k$. Let $\{v_i, u_1, \ldots, u_d = u\}$ be the vertices in the path
from $v_i$ to u and let $Z = \min\{\lambda_{u_s} : 1 \le s \le d\}$. Conditioned on Z, $I_v$ and
$I_w$ are independent. Let $Z \wedge \Lambda_i$ denote the minimum of Z and $\Lambda_i$. Since v
has depth $j - d$ above u, (24) yields

$E(I_v \mid Z) = \frac{1 - e^{-(j-d)(Z \wedge \Lambda_i)}}{j - d},$

and similarly for $I_w$. (Compare this with [10, Lemma 2.2].) As in [10,
equation (18)],

E(/./.) = -^AtA^- e-"^' - -(1 - e"^-^') - jil - e-'^^') + 
J — d k — d\ J k 

n _ Q-iJ+k-d)A,\ j^ ^-dki _ g-jAi _ g-fcAi _^ ^-{j+k-d)K I ^ /gg-j 

J ~\~ rC d 

The covariance of $I_v$ and $I_w$ is

$\mathrm{Cov}(I_v, I_w) = E(I_v I_w) - E(I_v) E(I_w).$

We say that a pair (v, w) is "good" if j and k satisfy

$\mu^{-1} \ln n_i - \ln^{0.6} n_i \le j, k \le \mu^{-1} \ln n_i + \ln^{0.6} n_i,$

and otherwise it is "bad". From [13, equation (7)] by (24) and (30) above,
for a good pair

Cov(4, JJ = le-(^+^-'^)^-(l - e-'^^O + 0(-^) = O^J-^). (31) 
jk Vln'^rij/ Vln'^nj/ 

(Compare this with [lOl equation (19)].) Since the number of bad vertices 

2 

is C'2,i(j^^^) it follows that the number of bad pairs, is C'li(j^^^). Hence, 
because of the obvious upper bound that Cov(/^, I^) is at most 1, the sum 
of covariances for the bad pairs is O i :^^^ ) . Thus, 

2 

E„,(^(T„A,)) = E„,( Yl Cov(4,/j)+o(^). (32) 

good {v,w)£Ti 
Recall that $\mathcal{G}_j$ is the $\sigma$-field generated by the split vectors for all vertices
v with $d(v) < j$. Recall the representation of subtree sizes in split trees
described in (16) in Section 1.4.1. Recall that $n_{iv}$ denotes the number of
balls in the subtrees rooted at v for $v \in T_i$. From (16) we get that for v,
where $d_i(v) = d$,

$E_{n_i}(n_{iv} \mid \mathcal{G}_{L+d}) \le n_i \prod_{r=1}^{d} W_r + s_1 d.$

Thus,

$E_{n_i}(n_{iv}) \le n_i \prod_{r=1}^{d} E(W_r) + d s_1 = \frac{n_i}{b^d} + d s_1.$

Again by using (16) we get that

$E_{n_i}(n_{iv}^2 \mid \mathcal{G}_{L+d}) = n_i^2 \prod_{r=1}^{d} W_r^2 + O\left(n_i d \prod_{r=1}^{d} W_r\right) + O(d^2).$

Thus,

$E_{n_i}(n_{iv}^2) \le n_i^2 \prod_{r=1}^{d} E(W_r^2) + O\left(\frac{n_i d}{b^d}\right) + O(d^2).$ (33)

Note that $E(W_r^2) < E(W_r) = \frac{1}{b}$ since $W_r \in [0, 1]$. Hence, there is an
$\epsilon > 0$ such that the right-hand side in (33) is bounded by

$\frac{n_i^2}{(b + \epsilon)^d} + O\left(\frac{n_i d}{b^d}\right) + O(d^2).$ (34)



From (|32]) by using (jMI, (|33]) and (jM]) 

n? ■b'^-d 



E„,MT^,A^))=0(^J2 



+ 



o( 



n^ 



{h + tYhf Ui^ Vln'^ni/ Vln"^ nj 



o(. 



n: 



n 



The estimate in Lemma 2.2 is used in the proof of the following result. 



Lemma 2.3. In a split tree T", let Vi,l < i < b^ , be the vertices at depth 



L = [_(5 logfe In n\ choosing (5 > 



1 



■iog(,E(y2)_r 



Then 



X(n = ^y.(T„A,) + Op( 



n 



In n 
Proof. We write the number of records as $X(T^n) = P^* + P_1 + \dots + P_{b^L}$, where $P^*$ is
the number of records with depth at most $L$ and $P_i$ is the number of records
in the subtree $T_i$ rooted at depth $L$, except for the root $v_i$. Let $\mathcal{F}_L$ be the
$\sigma$-field generated by $\{\lambda_v : d(v) \le L\}$ and $\mathcal{F}_L^*$ the $\sigma$-field generated by $T^n$ and
$\mathcal{F}_L$. We also note that $E(P_i \mid \mathcal{F}_L^*) = \varphi(T_i, \Lambda_i)$. By the same calculation as
in [10, equation (22)],



E( (x(n - p* - 5^y,(T„A,)) ^2 = E^(^-^^)- 



(35) 



Taking the expectation of the conditional expected value in (35) yields 



E( (X(T")-P*-5^^(T„A,)) ) =5^E^(T,,A, 



(36) 



We observe the obvious fact that the sum of those $n_i$, $i \in \{1, \dots, b^L\}$,
that are less than $n/b^{kL}$ for $k$ large enough, is bounded by



n 



jjkL Vln^n 



n 



(37) 



(Note that by choosing k large enough in (37) the power of the logarithm 



can be taken arbitrarily large.) Lemma 2.2 and (37) give that 

i:En.(^m,A.))=o(i:^). 



(3^ 



4 = 1 2=1 

(Compare this with [10, equation (25)].) The expected value of the sum in
(38) is equal to the expected value of the left-hand side in (36). From the
calculations in (33) above, for $i \in \{1, \dots, b^L\}$,

$$E(n_i^2) \le n^2 \big(E(V^2)\big)^L + O(nL). \tag{39}$$



Hence, choosing (3 > —^^ — ^,y2-)„i one gets from (39) that 



i=l 



Tnn 



and thus the left hand-side in (36) is o(r^). Thus, Lemma 
the well-known Markov inequality. 



2.3 



(40) 

follows from 

D 
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Applying Lemma 



2.1 



and Lemma 



2.3 



yields for (3 > 



that 



^(^") = E b^ 



2N, 



]\[.Q-{fJ- Mnni)Ai 



-■ \/i~Mnr2j /i^^lnrij 



good v&Ti 

T(T,) 



iog,E(y2)-i 
(di(t;) -/i-Mnraj)2 



-3 in3 , 



/i"-" m n. 

Or' 



P\ 1„2 



In n 



, (41) 



where we used that the Markov inequality gives Oli i ^^2.2^ / — ^p\in-n 
In pTl Corollary 2.2] we prove that 



E E 

i=l good DSTi 



{di{v) — fi -"^Inrij)^ a'^an 



jji~^ \v? n. 



+ 0.. 



1 2 ' "^p \ 1 2 
m n Mn n 



ra 



(42) 



We get for rii > 



n 

lykL 1 



E. 



g-(^ ^\nni)Ai _ ^-{/i ^\nn)Ai 



L + 1 



L + 1 



L + 1 + yU~^ In nj L + 1 + /i"^ In n 
O 



2 „ /' 



In n 



and it follows that 

E 



/i~^lnnj /i~^lnn 



O 



b^ In^ n 



(43) 



Again we use the bound in (37) for those $n_i < n/b^{kL}$ (for large enough $k$)
so that we can ignore them in the sums in (41). Thus, by (42) and (43) with
another application of the Markov inequality, the approximation in (41) can
be simplified to



6^ 



2Ni 



b^ 



T(r,) 

^ /i"^ In Hi ^ /i'^ln^ n, 
b^ 



^(n^E^^^-E 



a'^an 



u~^ Inn ^-^ 
(Compare this with [10, equation (27)].)






1 2 ' "^P V 1 2 

m n Vln n 



n 



■ (44) 
By choosing $\beta$ large enough we can sharpen the error term in (40), i.e.,

$$\sum_{i=1}^{b^L} E(n_i^2) = o\Big(\frac{n^2}{\ln^k n}\Big), \tag{45}$$

for arbitrary large $k$. Applying (45), the variance result in (6), and assuming
(A3), Chebyshev's inequality results in



*^ AT *^ 

In ti- ^-^ 



i=\ 



i=\ 



Inn, 



+ 0. 



n 



In n 



(46) 



The third sum in (44) is treated similarly. For simplicity (in the calculations
below) we change the notation $N_i$, $1 \le i \le b^L$, to $N_v$, $d(v) = L$, and similarly
for $n_i$, $1 \le i \le b^L$. Hence, from (44), for $\beta$ large enough, we get



■^(r") = E ^"" 



^/i-Mnrij ^ fi'Hn n, 



E 



T(T,) 



u ^ In n -^^^ 






m n Vln n 



n 



Lemma 2.4. Let $L = \lfloor \beta \log_b \ln n \rfloor$ for some constant $\beta$. Then



E 



n,;e 



-(^ -*- ln?i)Ai 



5^ n,e-('^-'"^")^"+o/ "" 



d{v)<L 



Inn 



r/ins, choosing (3 > _iog^E(y^)-i /'^'^"^ ('^'^^' 
b^ 



XiT") = E ^"" 






T(r,) 



^ /i Mnnj -^ /i-21n^ 

1 
/x~i Inn 



n,- 



1 x:^ i'/,-iitir,u (r'^an 



(47) 



d{?;)<L 



In n 



n 



In n 



Proof. Recall that we write $m := \mu^{-1} \ln n$, and $\Lambda_i$ for the minimum of the
$L + 1$ i.i.d. random variables $\lambda_v$, $v \in P(v_i) = \{\sigma, \dots, v_i\}$, where $P(v_i)$ is the
path from the root $\sigma$ to $v_i$. Thus, $e^{-m\Lambda_i}$ is the maximum of the $e^{-m\lambda_v}$. Now we define
$\Lambda_i^j$ as the $j$-th smallest value in $\{\lambda_v, v \in P(v_i)\}$, so that $e^{-m\Lambda_i^j}$ is the $j$-th
maximum. Note in particular that A^^ = A,. Choosing a = ^'""^ gives that 
for some $i$, the probability that at least $\lfloor\beta\rfloor + 1$ of the $\lambda_v$'s, $v \in P(v_i)$, are
less than $a$ is



m 



L/3J+1 



Thus, with probability tending to 1, there are at most $\lfloor\beta\rfloor$ values $\lambda_v$ less than
$a$ in each $P(v_i)$, giving for each $i$,



v£P{vi) j=l 



^^^ ■ T-\R\ 

< ^ e-™^" _ ^e-™^' <{L- [/3J)e-™" = - — f^ 



Hence, using that n^ - sh^ < '}2i:veP{v,) ^i ^ 
h^ L/3J h^ 



rir. 



-mX„ 



e ■•"■" +0r 



1=1 j=i i=i t)eP(Di) 



n 



{v)<L i:vGP(vi) 



Ann 

d{v)<L 

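Observing that the second smallest value $\Lambda_i^2$ of the $\lambda_v$, $v \in P(v_i)$, is at most $x$ if
at least two $\lambda_v$ are at most $x$, and using that the $\lambda_v$'s are i.i.d., we calculate
the distribution function of $\Lambda_i^2$ as

$$P(\Lambda_i^2 \le x) = 1 - P(\lambda_v > x)^L - L\,P(\lambda_v > x)^{L-1} P(\lambda_v \le x) = 1 - e^{-Lx} - L e^{-(L-1)x}\big(1 - e^{-x}\big).$$

Hence

$$E\big(e^{-m\Lambda_i^2}\big) = \int_0^{\infty} e^{-mx}\Big((L - L^2)e^{-Lx} + L(L-1)e^{-(L-1)x}\Big)\,dx = \frac{L - L^2}{m + L} + \frac{L(L-1)}{m + L - 1} = O\Big(\frac{L^2}{m^2}\Big),$$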
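The last step can be checked directly: the two fractions combine to $L(L-1)/((m+L)(m+L-1))$, which is $O(L^2/m^2)$. The sketch below verifies this with exact rational arithmetic and compares it with a numerical integral of the density. The function names are illustrative, and integer values of `m` are used only so that the exact check works (in the text $m = \mu^{-1}\ln n$ is real).

```python
import math
from fractions import Fraction

def second_min_laplace_exact(m, L):
    """(L - L^2)/(m + L) + L(L - 1)/(m + L - 1), as an exact fraction."""
    return Fraction(L - L * L, m + L) + Fraction(L * (L - 1), m + L - 1)

def second_min_laplace_numeric(m, L, upper=10.0, steps=200000):
    """Simpson approximation of the integral of e^{-mx} times the density of the second minimum."""
    def integrand(x):
        return math.exp(-m * x) * (
            (L - L * L) * math.exp(-L * x) + L * (L - 1) * math.exp(-(L - 1) * x)
        )

    h = upper / steps  # steps must be even
    total = integrand(0.0) + integrand(upper)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * integrand(i * h)
    return total * h / 3
```

With `m = 10` and `L = 5` the exact value is `Fraction(2, 21)`, that is $L(L-1)/((m+L)(m+L-1)) = 20/210$.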

implying 



'' ''' ,nL\ 



J2n.J2e-'''=0, 



i=l j=2 
Thus, the Markov inequality gives



^n,e-(^-'''^")^'= J2 ^.e-('^-'''^")^"+o/ "" 



j=i 



d{v)<L 



Inn 



n 



Thus, from Lemma 2.4 (where {3 is chosen large enough), by applying 
(46) and the total path length result in ^ we get 



^(^") = E ;;3i 



an^ 



-1 In n •^— ^ 



d{v)=L ^ V r- d(v)<L 



an^e 



-{fi -*- lnn)A^ 



(n 



ana 

+ ^^5 \- O. 



jj'"^ In^ n 
n 



, 2 ■ -ni 2 /• (48) 

m n vin n^ 



As in [13] and [10], the proof of Theorem 1.1, i.e., the main theorem, will
be completed by a classical theorem for convergence of triangular arrays to
infinitely divisible distributions, see e.g., [16, Theorem 15.28]. First we recall
the definition of

$$\xi_v := \frac{m\,n_v}{n}\, e^{-m\lambda_v} \tag{49}$$

in Section 2.1. Normalizing $X(T^n)$ gives by using (48)

X(T"; 



an \ 



an 



an In In n Cn 

+ 



/i Unn /i^Mn^n /i"2in2^ 
> ^„ + > ^- /i Inlnn 

d{v)<L d{v)=L ^ 

— ijT^ Inn + jjr'^a'^ + Op(l). 



(50) 



Let 



-2 1 2 

jj, In n -^-^ 7 



n,: 



n 



(^)=L 



/i ^ In n^, 



/i ""^ Inlnn — /i ^Inn + fx ^o"^. (51) 



and ^i = -^- Thus, 



-21^2, 



an 



an In In n C'^ 

+ 



/^ 111 ^ (xm) - 

an \ fi-^\nn fi'^ln^n ' /x-^ln^n 

n 
(i{i;)<L i=l 



(52) 
As in [10], since the $n_v$'s in the sums in (50) are not independent (although
they are less dependent for vertices that are far from each other), $\{\xi_v\} \cup \{\xi_i'\}$
is not a triangular array. Recall the definition of $\Omega_L$ as the $\sigma$-field generated
by $\{n_v, d(v) \le L\}$. Hence, conditioned on $\Omega_L$, $\{\xi_v\} \cup \{\xi_i'\}$ is a triangular
array with $\xi_i'$ conditioned on $\Omega_L$ deterministic.



2.3 Applying a limit theorem for sums of triangular 
arrays 



2.3.1 Theorem 2.1 which proves Theorem 1.1 



As in [13] and [10], the proof of Theorem 1.1 will be completed by a classical
theorem for convergence of sums of triangular arrays to infinitely divisible
distributions, see e.g., [16, Theorem 15.28]. For the sake of independence we
intend to condition on the $n_v$'s in the sums in (52). We show that conditioned
on the $n_v$'s we get convergence in distribution for the normalized $X(T^n)$
to a random variable $W$ with an infinitely divisible distribution, which does
not depend on the $n_v$'s we conditioned on. Then it follows in the same
way as in [10] that also unconditioned the normalized $X(T^n)$ converges in
distribution to $W$. The main Theorem 1.1 is proven by Theorem 2.1 below.



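Theorem 2.1. Choose any constant $c > 0$ and let $n \to \infty$. Conditioning
on the $\sigma$-field $\Omega_L$, where $L = \lfloor \beta \log_b \ln n \rfloor$, if the constant $\beta$ is chosen large
enough the following hold:

(i) $\sup_v P(\xi_v > x \mid \Omega_L) \to 0$ for every $x > 0$,

(ii) $A_1 := \sum_{d(v) \le L} P(\xi_v > x \mid \Omega_L) \xrightarrow{p} \nu(x, \infty) = \dfrac{\mu^{-1}}{x}$ for every $x > 0$,

(iii) $A_2 := \sum_{d(v) \le L} E(\xi_v 1[\xi_v \le c] \mid \Omega_L) - \dfrac{\mu^{-2} \ln^2 n}{n} \sum_{d(v) = L} \dfrac{n_v}{\mu^{-1} \ln n_v} + \mu^{-1} \ln\ln n + \mu^{-1} \ln n - \mu^{-2}\sigma^2 \xrightarrow{p} -\mu^{-1} \ln \mu^{-1} + \mu^{-1} - \mu^{-2}\sigma^2 - \dfrac{\sigma^2 - \mu^2}{2\mu^2} + \mu^{-1} \ln c$,

(iv) $A_3 := \sum_{d(v) \le L} \mathrm{Var}(\xi_v 1[\xi_v \le c] \mid \Omega_L) \xrightarrow{p} \mu^{-1} c$.

Before proving Theorem 2.1 we will show how it proves Theorem 1.1.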
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Recall from (51) that 
D 



/^" 



"^Iv? n 



n 



E 



n„ 



/i ^ lnra„ 



ji ^ In In ri — /i ^Inn + yU ^o"^. 



d(t))=L 

We apply [T6l Theorem 15.28] with 



-1 



0, h = — /i InyU " + yU 



-1 



-2 2 



a 



A* 



to X]d(i;)<L ^i' + Sr=i ^i conditioned on VLl with it = -^ deterministic. The 
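constants $a$ and $b$ are the constants that occur in the general formula of the
characteristic function for infinitely divisible distributions in (13). Note that
$\xi_i' \xrightarrow{p} 0$; thus because of (i), conditioned on $\Omega_L$, $\{\xi_v\} \cup \{\xi_i'\}$ is a null array.
We define $S(n) := \sum_{d(v) \le L} \xi_v + \sum_{i=1}^{b^L} \xi_i'$. From (ii) we have that

$$\frac{d\nu}{dx} = \frac{\mu^{-1}}{x^2},$$

hence

$$\int_0^c x^2\,d\nu(x) = \int_0^c \mu^{-1}\,dx = \mu^{-1} c \quad \text{and} \quad \int_c^1 x\,d\nu(x) = \mu^{-1} \int_c^1 \frac{1}{x}\,dx = -\mu^{-1} \ln c.$$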
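The two truncated moments of the Lévy measure with density $\mu^{-1}/x^2$ on $(0,\infty)$ can be sanity-checked by quadrature. This is an illustrative sketch only; the helper names and the sample values of `c` and `mu_inv` are assumptions, not quantities fixed by the paper.

```python
import math

def simpson(f, a, b, steps=20000):
    """Composite Simpson rule on [a, b]; steps must be even."""
    h = (b - a) / steps
    total = f(a) + f(b)
    for i in range(1, steps):
        total += (4 if i % 2 else 2) * f(a + i * h)
    return total * h / 3

def levy_density(x, mu_inv):
    """Density of nu, chosen so that nu(x, infinity) = mu_inv / x."""
    return mu_inv / (x * x)

def truncated_first_moment(c, mu_inv):
    """Integral of x dnu(x) over (c, 1]; should equal -mu_inv * ln(c)."""
    return simpson(lambda x: x * levy_density(x, mu_inv), c, 1.0)

def truncated_second_moment(c, mu_inv, eps=1e-12):
    """Integral of x^2 dnu(x) over (0, c]; should equal mu_inv * c (eps avoids dividing by zero)."""
    return simpson(lambda x: x * x * levy_density(x, mu_inv), eps, c)
```

With `c = 0.3` and `mu_inv = 2.0`, the first function should return approximately `-2 * ln(0.3)` and the second approximately `0.6`.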



Thus, the right-hand sides of (iii) and (iv) are $b - \int_c^1 x\,d\nu(x)$ and $\int_0^c x^2\,d\nu(x)$
respectively, where $b$ is the constant in (53). The convergence in Theorem 2.1
is in the probabilistic sense, while [16, Theorem 15.28] requires usual convergence,
i.e., standard point-wise convergence of sequences with no probability
involved. However, if the convergence instead were a.s. in Theorem 2.1 then
it would have been easy to see from this theorem that conditionally on $\Omega_L$ the
conditions of [16, Theorem 15.28] are fulfilled for $S(n)$. Thus, assuming a.s.
convergence in Theorem 2.1, [16, Theorem 15.28] implies that conditioned
on $\Omega_L$,

$$S(n) \xrightarrow{d} W, \quad \text{as } n \to \infty, \tag{54}$$



where $W$ has an infinitely divisible distribution (in particular a weakly 1-stable
distribution in this case) with characteristic function

$$E\big(e^{itW}\big) = \exp\Big(itb + \int_0^{\infty} \big(e^{itx} - 1 - itx\,1[x < 1]\big)\,d\nu(x)\Big)$$

(this is (14) in Remark 1.2, since $b = C + \mu^{-1}(\gamma - 1)$), which can be simplified
to (12) in Theorem 1.1.



It follows from (54) that conditioning on $\Omega_L$ has no influence on the
distributional convergence of $S(n)$ (unconditioned), since for any continuous
bounded function $g : \mathbb{R} \to \mathbb{R}$,

$$E\big(g(S(n)) \mid \Omega_L\big) = \int g\,dF\big(S(n) \mid \Omega_L\big) \to E\big(g(W)\big).$$

Thus, taking expectation, by dominated convergence

$$E\big(g(S(n))\big) \to E\big(g(W)\big).$$

This shows that also unconditioned $S(n) \xrightarrow{d} W$. Thus, unconditioned the
normalized $X(T^n)$ in (50) converges in distribution to $-W$.



It remains to show that convergence in probability (which is the type of
convergence in Theorem 2.1) actually is sufficient for $S(n) \xrightarrow{d} W$ to hold. In
[10] we proved this fact for the binary search tree in two ways, in one by using
subsequences and in the other one by using Skorohod's coupling theorem, see
e.g., [16, Theorem 3.30]. By analogy these proofs also work for general split
trees. Thus, the proof of Theorem 1.1 for $X_v(T)$ is completed.

Now it follows easily, by the same type of argument as for the binary
search tree [10], that the result holds for $X_e(T)$ too. One way to see this is to
consider $\bar{T}$ as the tree $T$ with the root deleted. Then there is a natural 1-1
correspondence between edges of $T$ and vertices of $\bar{T}$, and this correspondence
also preserves the record (and cutting) operations. Since it is very unlikely
that the root value would decide if values at high levels are records or not,
it follows that asymptotically $X_e(T)$ and $X_v(T)$ have the same distribution.
Thus, the proof of Theorem 1.1 is completed.



The idea of the proof of Theorem 2.1 is, as for the binary search tree
[10, Theorem 2.1], to use Chebyshev's inequality to prove (ii), (iii) and (iv)
of Theorem 2.1 ((i) is very easy to prove). For the binary search tree we
frequently used in [10, Theorem 2.1] that the sum $\sum_{r=1}^{k} \ln U_r$, where the $U_r$
are uniform $U(0, 1)$ random variables, is distributed as a $-\Gamma(k, 1)$ random
variable. For general split trees, the solution of the renewal function $U(t)$ in
(20) is fundamental for the proof of Theorem 2.1.



2.3.2 Lemmas for the Proof of Theorem 2.1

Recall that we write $\Omega_j$ for the $\sigma$-field generated by $\{n_v, d(v) \le j\}$ and $\mathcal{G}_j$
for the $\sigma$-field generated by $\{W_{r,v}, r \in \{1, 2, \dots, j\}\}$, for all vertices $v$ with
$d(v) = j$. Also recall that we write $L = \lfloor \beta \log_b \ln n \rfloor$. For a vertex $v$ with
$d(v) = k$ we also write

$$\hat{n}_v := n \prod_{r=1}^{k} W_{r,v}, \quad \text{and} \quad \hat{\xi}_v := \frac{m\,\hat{n}_v}{n}\, e^{-m\lambda_v}, \tag{55}$$

where $m := \mu^{-1} \ln n$. Note that $\mathcal{G}_j$ is thus equivalently the $\sigma$-field generated
by $\{\hat{n}_v : d(v) \le j\}$.

We present below four crucial lemmas by which we can then easily prove
Theorem 2.1.
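Lemma 2.5. Suppose that $n \to \infty$ and choose any constant $c > 0$. Then for
$L = \lfloor \beta \log_b \ln n \rfloor$ and $\beta$ large enough, the following hold:

$$\sum_{d(v) \le L} P(\xi_v > x \mid \Omega_L) = \sum_{d(v) \le L} P(\hat{\xi}_v > x \mid \mathcal{G}_L) + o_p(1),$$

$$\sum_{d(v) \le L} E(\xi_v 1[\xi_v \le c] \mid \Omega_L) = \sum_{d(v) \le L} E(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L) + o_p(1),$$

$$\sum_{d(v) = L} \frac{n_v}{\mu^{-1} \ln n_v} = \frac{n}{\mu^{-1} \ln n} - \sum_{d(v) = L} \frac{\hat{n}_v \ln(\hat{n}_v / n)}{\mu^{-1} \ln^2 n} + o_p\Big(\frac{n}{\ln^2 n}\Big),$$

$$\sum_{d(v) \le L} \mathrm{Var}(\xi_v 1[\xi_v \le c] \mid \Omega_L) = \sum_{d(v) \le L} \mathrm{Var}(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L) + o_p(1).$$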

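For simplicity we sometimes use a short notation for the following sums,
i.e.,

$$\Phi := \frac{n}{\mu^{-1} \ln n} - \sum_{d(v) = L} \frac{\hat{n}_v \ln(\hat{n}_v / n)}{\mu^{-1} \ln^2 n},$$

$$R_1 := \sum_{d(v) \le L} P(\hat{\xi}_v > x \mid \mathcal{G}_L), \tag{56}$$

$$R_2 := \sum_{d(v) \le L} E(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L) - \frac{\mu^{-2} \ln^2 n}{n} \cdot \Phi, \tag{57}$$

$$R_3 := \sum_{d(v) \le L} \mathrm{Var}(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L). \tag{58}$$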

Lemma 2.6. Suppose that $n \to \infty$ and choose any constant $c > 0$. Then for
$L = \lfloor \beta \log_b \ln n \rfloor$ and $\beta$ large enough, the following hold:

$$E(R_1) = \frac{\mu^{-1}}{x} + o(1) = \nu(x, \infty) + o(1), \quad \text{for every } x > 0,$$

$$E(R_2) = -\mu^{-1} \ln n - \mu^{-1} \ln\ln n + \mu^{-1} - \mu^{-1} \ln \mu^{-1} + \mu^{-1} \ln c - \frac{\sigma^2 - \mu^2}{2\mu^2} + o(1),$$

$$E(R_3) = \mu^{-1} c + o(1).$$
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logbln?^ 



Let / := [ °^''^"" j and for short write 



l<d{v)<L 

s2--= Yl HMl < ^Pl) - ^^—^ ■ <^, 

l<d{v)<L 

Ss:= Y. var(e;i[e;<c]|^i). 

l<d(v)<L 



n 



(59) 

(60) 
(61) 



Lemma 2.7. Suppose that n — t- oo. Then for L = [/^log^lnnj (where /3 is 
large enough) and I = [ °^''^°" J, the following limits hold 



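$$\mathrm{Var}\big(E(S_1 \mid \mathcal{G}_l)\big) \to 0, \tag{62}$$

$$\mathrm{Var}\big(E(S_2 \mid \mathcal{G}_l)\big) \to 0, \tag{63}$$

$$\mathrm{Var}\big(E(S_3 \mid \mathcal{G}_l)\big) \to 0. \tag{64}$$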



Lemma 2.8. Suppose that n — )■ oo. Then for L = [/Jlog^lnnJ (where /3 is 
large enough) and I = [ °^''^"" J the following limits hold 



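$$E\big(\mathrm{Var}(S_1 \mid \mathcal{G}_l)\big) \to 0, \tag{65}$$

$$E\big(\mathrm{Var}(S_2 \mid \mathcal{G}_l)\big) \to 0, \tag{66}$$

$$E\big(\mathrm{Var}(S_3 \mid \mathcal{G}_l)\big) \to 0. \tag{67}$$

Before proving these lemmas we show how their use leads to the proof of
Theorem 2.1.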



2.3.3 Proof of Theorem 2.1

Recall that $m = \mu^{-1} \ln n$. For any $x > 0$, and $v$ with $d(v) \le L$, we have

$$P(\xi_v > x \mid \Omega_L) = P\Big(e^{-m\lambda_v} > \frac{nx}{m n_v} \,\Big|\, \Omega_L\Big) = P\Big(\lambda_v < \frac{1}{m} \ln \frac{m n_v}{nx} \,\Big|\, \Omega_L\Big) = 1 - \exp\Big(-\frac{1}{m} \ln^{+} \frac{m n_v}{nx}\Big). \tag{68}$$

Thus, for every $x > 0$,

$$P(\xi_v > x \mid \Omega_L) \le \frac{1}{m} \ln^{+} \frac{m n_v}{nx} \le \frac{1}{m} \ln^{+} \frac{m}{x} \to 0, \tag{69}$$

which proves (i).
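The tail probability in (68) and the uniform bound in (69) are simple enough to tabulate. The sketch below assumes, as in the displayed formula, that $\lambda_v$ is Exp(1) and that $\ln^{+} y = \max(\ln y, 0)$. The function names and the sample parameters are illustrative only.

```python
import math

def log_plus(y):
    """ln^+(y) = max(ln y, 0) for y > 0."""
    return max(math.log(y), 0.0)

def tail_probability(x, m, n_v, n):
    """P(xi_v > x | Omega_L) for xi_v = (m n_v / n) exp(-m lambda_v), lambda_v ~ Exp(1)."""
    return 1.0 - math.exp(-log_plus(m * n_v / (n * x)) / m)

def uniform_bound(x, m):
    """The bound (1/m) ln^+(m / x), which does not depend on the vertex v."""
    return log_plus(m / x) / m
```

For instance, with `n = 10**6` and `m = 2 * math.log(n)`, every subtree size `n_v <= n` gives `tail_probability(x, m, n_v, n) <= uniform_bound(x, m)`, and the bound itself decays like $\ln m / m$.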



Recall the definitions of $R_1$, $R_2$ and $R_3$ in (56), (57) and (58). Note that
Lemma 2.5 shows that in Theorem 2.1 the left-hand sides of (ii), (iii) and
(iv), i.e., $A_1$, $A_2$ and $A_3$, respectively, are equal to

$$A_1 = R_1 + o_p(1),$$

$$A_2 = R_2 + \mu^{-1} \ln\ln n + \mu^{-1} \ln n - \mu^{-2}\sigma^2 + o_p(1) =: \widetilde{R}_2 + o_p(1),$$

$$A_3 = R_3 + o_p(1).$$

Lemma 2.6 shows that the expected values of $R_1$, $\widetilde{R}_2$ and $R_3$ converge to the
right-hand sides in (ii), (iii) and (iv) of Theorem 2.1.

We complete the proof of Theorem 2.1 by showing that

$$\mathrm{Var}(R_1) \to 0 \text{ for every } x > 0, \quad \mathrm{Var}(R_2) \to 0, \quad \text{and} \quad \mathrm{Var}(R_3) \to 0. \tag{70}$$

Then by Chebyshev's inequality (ii), (iii) and (iv) of Theorem 2.1 follow.
Thus, it remains to show how (70) follows from Lemma 2.7 and Lemma 2.8.



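By using (69), one easily obtains

$$\sum_{d(v) \le L} P(\hat{\xi}_v > x \mid \mathcal{G}_L) = \sum_{l < d(v) \le L} P(\hat{\xi}_v > x \mid \mathcal{G}_L) + o(1), \tag{71}$$

$$\sum_{d(v) \le L} E(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L) = \sum_{l < d(v) \le L} E(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L) + o(1), \tag{72}$$

$$\sum_{d(v) \le L} \mathrm{Var}(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L) = \sum_{l < d(v) \le L} \mathrm{Var}(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L) + o(1). \tag{73}$$

Hence,

$$R_1 = S_1 + o(1), \quad R_2 = S_2 + o(1), \quad R_3 = S_3 + o(1). \tag{74}$$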



To show (70) we use a variance formula that is easy to establish, see e.g.,
[9, exercise 10.17-2],

$$\mathrm{Var}(X) = E\big(\mathrm{Var}(X \mid \mathcal{G})\big) + \mathrm{Var}\big(E(X \mid \mathcal{G})\big), \tag{75}$$

where $X$ is a random variable and $\mathcal{G}$ is a sub $\sigma$-field.
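The decomposition (75) is the law of total variance. As a small illustration, the sketch below checks it exactly on a finite joint distribution, where conditioning on the sub $\sigma$-field is represented by conditioning on a discrete label `g`. The function name and the toy distribution are assumptions for illustration only.

```python
from fractions import Fraction

def total_variance_parts(joint):
    """joint maps (g, x) -> probability. Returns (Var X, E Var(X|G), Var E(X|G))."""
    def mean(dist):
        return sum(value * prob for value, prob in dist.items())

    def var(dist):
        mu = mean(dist)
        return sum((value - mu) ** 2 * prob for value, prob in dist.items())

    marginal_x, by_group, group_prob = {}, {}, {}
    for (g, x), p in joint.items():
        marginal_x[x] = marginal_x.get(x, 0) + p
        group_prob[g] = group_prob.get(g, 0) + p
        by_group.setdefault(g, {})
        by_group[g][x] = by_group[g].get(x, 0) + p

    # Conditional law of X given each label g.
    cond = {g: {x: p / group_prob[g] for x, p in d.items()} for g, d in by_group.items()}
    expected_cond_var = sum(group_prob[g] * var(cond[g]) for g in cond)

    # Law of the conditional mean E(X | G), viewed as a random variable.
    cond_mean_law = {}
    for g in cond:
        mu_g = mean(cond[g])
        cond_mean_law[mu_g] = cond_mean_law.get(mu_g, 0) + group_prob[g]
    return var(marginal_x), expected_cond_var, var(cond_mean_law)
```

Using `Fraction` probabilities makes the identity hold exactly rather than up to rounding.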

Recall that $\mathcal{G}_j$ is the $\sigma$-field generated by $\{W_{r,v}, r \in \{1, 2, \dots, j\}\}$, for all
vertices with $d(v) = j$. Consequently, by applying the variance formula in
(75), from Lemma 2.7 and Lemma 2.8 we get as $n \to \infty$

$$\mathrm{Var}(S_1) = E\big(\mathrm{Var}(S_1 \mid \mathcal{G}_l)\big) + \mathrm{Var}\big(E(S_1 \mid \mathcal{G}_l)\big) \to 0,$$

$$\mathrm{Var}(S_2) = E\big(\mathrm{Var}(S_2 \mid \mathcal{G}_l)\big) + \mathrm{Var}\big(E(S_2 \mid \mathcal{G}_l)\big) \to 0,$$

$$\mathrm{Var}(S_3) = E\big(\mathrm{Var}(S_3 \mid \mathcal{G}_l)\big) + \mathrm{Var}\big(E(S_3 \mid \mathcal{G}_l)\big) \to 0,$$

and thus (70) follows from (74).

We have proved Theorem 2.1 by the use of the lemmas, and
thus also Theorem 1.1.

2.3.4 Proofs of the Lemmas of Theorem 2.1

Finally we present the proofs of Lemma 2.5, Lemma 2.6, Lemma 2.7 and
Lemma 2.8.



Proof of Lemma 2.5. From (16) and (17) in Section 1.4.1 we get in particular
that given $\mathcal{G}_L$, for $v$ with $d(v) = k$,

$$n_v \le \mathrm{Binomial}\Big(n, \prod_{r=1}^{k} W_{r,v}\Big) + s_1 L,$$

$$n_v \ge \mathrm{Binomial}\Big(n, \prod_{r=1}^{k} W_{r,v}\Big) - sL.$$

Since a Binomial$(k, p)$ random variable has expected value $kp$ and variance
$kp(1 - p)$, the Chebyshev inequality results in



k 



(76) 



This motivates the notation $\hat{n}_v := n \prod_{r=1}^{k} W_{r,v}$ in (55). Also recall
that we write $\hat{\xi}_v := \frac{m \hat{n}_v}{n} e^{-m\lambda_v}$ for $m := \mu^{-1} \ln n$, and that $\mathcal{G}_j$ is the $\sigma$-field
generated by $\{\hat{n}_v : d(v) \le j\}$. By using (68) and (69) we get (compare with
[10, equation (55)]),



{v)<L 

and similarly 



$:p({„>.in,)=x:5:-i"+— (i+o(— ) 

■^ — ^ ' ^ — ^ ^ — ^ m nx \ m 



(77) 



fc=l d{v)=k 



.lnr?7,^ 






d(i))<L A;=l (i{i;)=fc 

By using (77), (78) and (76) we get

$$\sum_{d(v) \le L} P(\xi_v > x \mid \Omega_L) = \sum_{d(v) \le L} P(\hat{\xi}_v > x \mid \mathcal{G}_L) + o_p(1). \tag{79}$$
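One easily gets (compare with [13, p. 251] and [10, equations (61)-(62)]) that

$$\sum_{d(v) \le L} E(\xi_v 1[\xi_v \le c] \mid \Omega_L) = \sum_{d(v) \le L} \frac{m\,n_v}{n(m+1)} \exp\Big(-\frac{m+1}{m} \ln^{+} \frac{m n_v}{nc}\Big), \tag{80}$$

and similarly

$$\sum_{d(v) \le L} E(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L) = \sum_{d(v) \le L} \frac{m\,\hat{n}_v}{n(m+1)} \exp\Big(-\frac{m+1}{m} \ln^{+} \frac{m \hat{n}_v}{nc}\Big). \tag{81}$$

Thus, (76) implies that

$$\sum_{d(v) \le L} E(\xi_v 1[\xi_v \le c] \mid \Omega_L) = \sum_{d(v) \le L} E(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L) + o_p(1). \tag{82}$$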



Using the bound in (37) for the sum of the subtree sizes with Uy less than 



^ (for k large enough) we get the expansion 



//-I It 



n 






d(v)=L d(v)=L 



By again using (76) (compare with [10, equation (68)]) we get






n 



2^ „^i- ^ +"" 



,^ ^ « Mnn^, /i Mnra _^^^ u^iln^n ^ In^n ' 

d(v)=L d(v)=L 

By using the calculations in [13, p. 251-252] (compare with [10, equation
(70)]) we get



Y Var(e.l[e. <c]|fii)= Y 

d(v)<L d(v)<L 






TTl Tiy 2m-\-l 1 / mny \ 

g m "•" ^ nc -^ 



+ 0(1) 



^3) 



and similarly 



Y Vargl£<c]|^L) 

i{v)<L 



E 

d(i))<L 



2^2 _ 

TTl fly 2m. + l 1 / mriij \ 

g m "1"^ nc -^ 

2mn2 



+ o(l] 



U) 



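Thus, using (76) we obtain

$$\sum_{d(v) \le L} \mathrm{Var}(\xi_v 1[\xi_v \le c] \mid \Omega_L) = \sum_{d(v) \le L} \mathrm{Var}(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L) + o_p(1). \tag{85}$$

□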
Proof of Lemma 2.6. Recall that we write

$$Y_k := -\sum_{r=1}^{k} \ln W_r \tag{86}$$

and that we write

$$R_1 = \sum_{d(v) \le L} P(\hat{\xi}_v > x \mid \mathcal{G}_L).$$



As in the calculations in [10, equation (56)], from (78) one gets

$$E(R_1) = (1 + o(1)) \sum_{k=1}^{L} b^k\, E\Big(\frac{\ln m - \ln x - Y_k}{m}\, I\{Y_k \le \ln m - \ln x\}\Big). \tag{87}$$



By using integration by parts we get that the sum in (87) is equal to

$$\sum_{k=1}^{L} b^k\, \frac{1}{m} \int_0^{\ln m - \ln x} P(Y_k \le t)\,dt = \frac{1}{m} \int_0^{\ln m - \ln x} \sum_{k=1}^{L} b^k P(Y_k \le t)\,dt. \tag{88}$$



fc=i 



Recall the definition of the renewal function $U(t) := \sum_{k=1}^{\infty} b^k P(Y_k \le t)$
in (19) above. We want to show that

$$\frac{1}{m} \int_0^{\ln m - \ln x} \sum_{k=L+1}^{\infty} b^k P(Y_k \le t)\,dt = o(1). \tag{89}$$



To show this we use large deviations. Choose an arbitrary $s > 0$; by applying
the Markov inequality and using that the $W_{r,v}$, $r \in \{1, \dots, k\}$, are i.i.d. we
get

$$P(Y_k \le t) = P(-Y_k \ge -t) = P\big(e^{-sY_k} \ge e^{-st}\big) \le \big(E(V^s)\big)^k e^{st}. \tag{90}$$

Choosing $s > 1$, we get

$$E(V^s) < E(V) = \frac{1}{b}.$$

Thus, we can find $\delta > 0$ such that

$$E(V^s) \le \frac{1}{b^{1+\delta}}. \tag{91}$$
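The Markov (Chernoff-type) bound (90) can be illustrated in the binary search tree case mentioned earlier, where $b = 2$, $V$ is uniform on $(0,1)$ and $Y_k$ is $\Gamma(k,1)$ distributed, so that $E(V^s) = 1/(s+1)$. The sketch below is an illustration under exactly these assumptions; the function names are not from the paper.

```python
import math

def gamma_cdf(k, t):
    """P(Gamma(k, 1) <= t) for integer k >= 1 and moderate t, via e^{-t} * sum_{i >= k} t^i / i!."""
    if t <= 0:
        return 0.0
    term = math.exp(k * math.log(t) - math.lgamma(k + 1) - t)
    total, i = 0.0, k
    while i < k + 5000:
        total += term
        i += 1
        term *= t / i
        # Stop once the terms are past their peak and negligible.
        if i > t and term < 1e-18 * total:
            break
    return min(total, 1.0)

def chernoff_bound(k, t, s):
    """Right-hand side of (90) for uniform V: (E V^s)^k e^{st} with E V^s = 1/(s + 1)."""
    return (1.0 / (s + 1.0)) ** k * math.exp(s * t)
```

For $s > 1$ the factor $1/(s+1)$ is strictly below $1/b = 1/2$, which is the gap exploited in (91).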
In the definition of $L = \lfloor \beta \log_b \ln n \rfloor$ the constant $\beta$ can be chosen arbitrarily
large. It is enough to show that $\sum_{k=L+1}^{\infty} b^k P(Y_k \le \ln m - \ln x)$ is $o(1)$
for proving (89). By applying (90) and (91) we get that

$$\sum_{k=L+1}^{\infty} b^k P(Y_k \le \ln m - \ln x) \le \sum_{k=L+1}^{\infty} \frac{b^k}{b^{k+\delta k}} \cdot \frac{m^s}{x^s} = \frac{m^s}{x^s} \sum_{k=L+1}^{\infty} b^{-\delta k}. \tag{92}$$

Thus, choosing $\beta > s/\delta$ in $L$ gives (89). Now the solution of $U(t)$ in (20)
gives that the quantity in (88) is equal to

$$\frac{1}{m} \int_0^{\ln m - \ln x} U(t)\,dt + o(1) = \frac{\mu^{-1} + o(1)}{m} \int_0^{\ln m - \ln x} e^t\,dt + o(1) = \frac{\mu^{-1}}{x} + o(1) = \nu(x, \infty) + o(1). \tag{93}$$

Hence, $E(R_1) = \nu(x, \infty) + o(1)$.
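For intuition, the renewal function can be computed explicitly in the binary search tree case. The sketch below assumes $b = 2$ and uniform $V$, so that $Y_k$ is $\Gamma(k,1)$ distributed; summing the densities then gives $U(t) = 2(e^t - 1)$, consistent with $\mu^{-1} = 2$ in that case. All names are illustrative, and the closed form is specific to this example rather than a statement about general split trees.

```python
import math

def gamma_cdf(k, t):
    """P(Gamma(k, 1) <= t) for integer k >= 1 and moderate t > 0."""
    term = math.exp(k * math.log(t) - math.lgamma(k + 1) - t)
    total, i = 0.0, k
    while i < k + 5000:
        total += term
        i += 1
        term *= t / i
        if i > t and term < 1e-18 * total:
            break
    return total

def renewal_function(t, b=2, k_max=120):
    """Truncated sum U(t) = sum_{k >= 1} b^k P(Y_k <= t) with Y_k ~ Gamma(k, 1)."""
    return sum(b ** k * gamma_cdf(k, t) for k in range(1, k_max + 1))

def normalized_integral(m, x):
    """(1/m) * integral of 2(e^t - 1) over [0, ln m - ln x], in closed form (needs m > x)."""
    a = math.log(m / x)
    return 2.0 * (math.exp(a) - 1.0 - a) / m
```

As `m` grows, `normalized_integral(m, x)` approaches `2 / x`, which is $\mu^{-1}/x = \nu(x,\infty)$ under the assumption $\mu^{-1} = 2$.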



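In analogy with [10, equation (63)], by using (81) and (86) we deduce
that

$$E\Big(\sum_{d(v) \le L} E(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L)\Big) =: E_1 + E_2, \tag{94}$$

where

$$E_1 = E \sum_{d(v) \le L} \frac{m}{m+1}\, e^{-Y_k}\, e^{-\frac{m+1}{m}(\ln m - \ln c - Y_k)}\, I\{Y_k \le \ln m - \ln c\},$$

$$E_2 = E \sum_{d(v) \le L} \frac{m}{m+1}\, e^{-Y_k}\, I\{Y_k > \ln m - \ln c\}. \tag{95}$$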



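By using integration by parts, applying the solution of $U(t)$ in (20) and
using (92) we obtain that

$$E_1 = e^{-\frac{m+1}{m}(\ln m - \ln c)}\, \frac{m}{m+1} \int_0^{\ln m - \ln c} \sum_{k=1}^{L} b^k e^{t/m}\,dP(Y_k \le t)$$

$$= e^{-\frac{m+1}{m}(\ln m - \ln c)}\, \frac{m}{m+1} \Big( \Big[\sum_{k=1}^{L} b^k e^{t/m} P(Y_k \le t)\Big]_0^{\ln m - \ln c} - \frac{1}{m} \int_0^{\ln m - \ln c} \sum_{k=1}^{L} b^k e^{t/m} P(Y_k \le t)\,dt \Big) = \mu^{-1} + o(1). \tag{96}$$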
By similar calculations as in (96)



^ m m 

E2 = rL - 



171 + 1 7n+ 1 Jq 



Inm— Inc ^ 



J2b'e-'dP{Y,<t) 



k=l 



m 



m 



m + 1 m + 1 Vl^^ V fc - ; 

^ fc=i 

"Inm— Inc ^ 



Inm— Inc 




+ 



L- fi-^ - 



m 



m + 1 Jq 



k=l 
Inm— Inc ^ 



L 

J2 b''e-'P(Yk < t)dt + oil). (97) 



k=l 



From (92) it follows that 



Inm— Inc ^ 

^h^e~'V{Yk<t)dt 
fe=i 

Inm— Inc 

ttTTU\ ,.-l„t\r, I , -1 



Inm— Inc 



e~*f/(t)dt + 0(1) 







e~*(f/(t) - fi''e)dt + ii'\\nm - Inc) + o(l) 



Applying the solution of $W(x) := \int_0^x e^{-t}\big(U(t) - \mu^{-1} e^t\big)\,dt$ in (21), from (97)
we get that

$$E_2 = L - \mu^{-1} \ln m + \mu^{-1} \ln c - \frac{\sigma^2 - \mu^2}{2\mu^2} + o(1). \tag{98}$$



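Recalling (94) and applying the approximations of $E_1$ in (96) and $E_2$ in (98)
we deduce that

$$E\Big(\sum_{d(v) \le L} E(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L)\Big) = L + \mu^{-1} - \mu^{-1} \ln m + \mu^{-1} \ln c - \frac{\sigma^2 - \mu^2}{2\mu^2} + o(1),$$

which is equal to

$$K := L + \mu^{-1} - \mu^{-1} \ln\ln n - \mu^{-1} \ln \mu^{-1} + \mu^{-1} \ln c - \frac{\sigma^2 - \mu^2}{2\mu^2} + o(1). \tag{99}$$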



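By the definition of $\hat{n}_v$ in (55),

$$\Phi = \frac{n}{\mu^{-1} \ln n} - \sum_{d(v) = L} \frac{\hat{n}_v \ln(\hat{n}_v / n)}{\mu^{-1} \ln^2 n} = \frac{n}{\mu^{-1} \ln n} - \sum_{d(v) = L} \frac{n \prod_{r=1}^{L} W_{r,v} \sum_{r=1}^{L} \ln W_{r,v}}{\mu^{-1} \ln^2 n}. \tag{100}$$

Hence, by using the definition of $\mu$ in (1) we get that

$$E(\Phi) = \frac{n}{\mu^{-1} \ln n} + \frac{nL}{\mu^{-2} \ln^2 n}. \tag{101}$$

Thus, recalling the definition of $R_2$ in (57) we get $E(R_2) = K - \mu^{-1} \ln n - L$,
where $K$ is defined in (99).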
Recall that 

Rz= Y. Var(e;i[e:<c]|^i). 

d{v)<L 



By using (84) we get that 

L 



E (i?3) = ^ ^E W Wl^e-^^'---^-''-'^~^^^nY,<'-m-Xnc} ^ ^(^^ 
k=l \r=l / 

= \/i + \/2 + o(l), (102) 



where 



14 := e 






m 



Inm— Inc ^ 



J2b''e^dP{Yk<t), 



k=l 



^2 := E E V n ^i^{^^ > Inm - Inc} 



(103) 



,k=l r=l 



By applying the solution of U{t) in (20), integration by parts results in 



v.= r y: "^--''dPiYk < t) 

J\nm—\nc 7,_i ■^ 



m 

~2 



fl ^C 



e-''U{t)\ +m e-''U{t)dt + o{l) 

llnm-lnC ./l>nrr,_1„^ 2 



Inm— Inc 



0(1), 

(104) 



where we used (90) (choosing $1 < s < 2$) and then similar calculations as in
(92) to show that if we sum over all $k$ instead of $k \le L$ the error term is just
$o(1)$.



Similarly, by using (92), integration by parts gives

$$V_1 = \frac{\mu^{-1} c}{2} + o(1). \tag{105}$$

Hence, $E(R_3) = \mu^{-1} c + o(1)$. □



Proof of Lemma 2.7. For a given vertex $v_i \in T$ with $d(v_i) = l$, there are
at most $b^{j-l}$ choices of $v$ at depth $j$ with ancestor $v_i$. Recall that $Y_{j,v} :=
-\sum_{r=1}^{j} \ln W_{r,v}$. For $v$ with $d(v) = j$, we also write

$$Z_{j-l,v} := Y_{j,v} - Y_{l,v_i} = -\sum_{r=l+1}^{j} \ln W_{r,v}. \tag{106}$$



r=l+l 



Recall from (59) that 



Si= Yl P{1>x\^l). 

l<d{v)<L 



Using (71) and the solution of the renewal equation U{t) in (20) we get 



by similar calculations as in (|87|)-(|93|) 



-| flnm— In x—Yiy L 



E 



TTl 



J2 V-'P{Z,^i,y<t)dt + o{l) 



m 



I 



Inm— Inx— Yj „. 



^~'edt + o(i) 



i=l r=l 



(107) 



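Thus, $\mathrm{Var}\big(E(S_1 \mid \mathcal{G}_l)\big)$ is $o(1)$, which shows (62).

We show that (63) is true by similar calculations as for showing (62).
Recall from (60) that

$$S_2 = \sum_{l < d(v) \le L} E(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L) - \frac{\mu^{-2} \ln^2 n}{n} \cdot \Phi,$$

where

$$\Phi = \frac{n}{\mu^{-1} \ln n} - \sum_{d(v) = L} \frac{\hat{n}_v \ln(\hat{n}_v / n)}{\mu^{-1} \ln^2 n}.$$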



First, as before we let $v_i$ with $d(v_i) = l$ be a given vertex so that there
are at most $b^{j-l}$ choices of $v$ at depth $j$ with ancestor $v_i$. Recall the notation
of $Z_{j-l,v}$ in (106), i.e., $Y_{j,v} = Y_{l,v_i} + Z_{j-l,v}$. By similar calculations as in (107)
(glancing at the calculations in (94)) we obtain



EJ Yl H^vl[l<CpL) 
J<d{v)<L 



Fi + F2, 



where 



Fi :=E 



v^ Tn 



^ m + 1 

l<d{v)<L 



g-y,,„^-Z,_,,„g-=i±i(lnm-lnc-Yi,,^-Z,_,,,). 



/{F/,^^ + Zj_i^^ < Inm - Inc} 



F,:=E Y. 



m 



J<d{v)<L 



m + 1 



-e ''."i 



-'■'I{Yi^v, + Zj_i^^ > Inm - Inc} 



^i 



Then by similar calculations as in (|96|) 

m + 1 ^ 



Fi = e 



E -"'' n ^r,.. + 0(1) = /i-' + 0(1). (108) 



j=l r=l 



By similar calculations as in (96)-(98), we obtain 

b' I 



u I / 



lnm-lnc-y,^„. ^^ L 



m 



- Y V-'e-'dV{Z,.,^,<t)\ +0(1 



i=«+i 



fe' « 



a2-;x2 



5^J]iy,,,, (L-/-/i-Mnm + /i-Mnc-^-^-/i-^5^1nW^,,„J + o(l) 



=1 r=l 



r=l 



L — l — n ^\nm + n ^Inc 



a'-fi^ 



b' I 



2/^2 



(109) 



i=l r=l r=l 
Thus, by applying the approximations of $F_1$ in (108) and $F_2$ in (109) we get



l<d(v)<L 

i=l r=l r=l 

Let Vi be a vertex at depth / and let w be a vertex at depth L. Similarly 



as in (100) and (101) (compare with [TOl equations (78)-(79)]), we get that 



n 






/i Mnn /i-^ln^ 



ra 



1=1 



/^' 



-Mn^ 



n 



In n^ 



'1111 



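From (110) and (111) we obtain that $\mathrm{Var}\big(E(S_2 \mid \mathcal{G}_l)\big)$ is $o(1)$, which shows
(63).

For (64) we proceed with the same method as for showing (62) and (63).
Recall from (61) that

$$S_3 = \sum_{l < d(v) \le L} \mathrm{Var}(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L).$$

By similar calculations as in (102) and (103) we get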



e( Y. Vargl£<c]|^,.)|^,)=/i + /2 + o(l), 



l<d{v)<L 



where 



/, : = e-^c.™-...) y- y- ^.i.a'iz,^,, < t), 



i=l j=l+l 



V ^m 



h: = Y^{Yl -^ n K^. n KJi"^^,^^ + Zj-i^^ > Inm - Inc} 

j=l j=l+l r=l r=l+l 



%. 



Using integration by parts we calculate (similarly as in (104) and (105) 

i=l r=l 



Thus, $\mathrm{Var}\big(E(S_3 \mid \mathcal{G}_l)\big)$ is $o(1)$, which shows (64). □
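Proof of Lemma 2.8. Recall from (59) that

$$S_1 = \sum_{l < d(v) \le L} P(\hat{\xi}_v > x \mid \mathcal{G}_L).$$

For showing (65) we first note that

$$\mathrm{Var}\Big(\sum_{l < d(v) \le L} P(\hat{\xi}_v > x \mid \mathcal{G}_L) \,\Big|\, \mathcal{G}_l\Big) = \sum_{\substack{l < d(v) \le L, \\ l < d(w) \le L}} \mathrm{Cov}\big(P(\hat{\xi}_v > x \mid \mathcal{G}_L), P(\hat{\xi}_w > x \mid \mathcal{G}_L) \mid \mathcal{G}_l\big).$$

To estimate these conditional covariances we can suppose that the closest
ancestor $u$ for $v$ with $d(v) \le L$, and $w$ with $d(w) \le L$ is at depth $d \ge l$, since
the other terms are just 0 because of independence. For $d \ge l$, we use

$$\mathrm{Cov}\big(P(\hat{\xi}_v > x \mid \mathcal{G}_L), P(\hat{\xi}_w > x \mid \mathcal{G}_L) \mid \mathcal{G}_l\big) \le E\big(P(\hat{\xi}_v > x \mid \mathcal{G}_L) P(\hat{\xi}_w > x \mid \mathcal{G}_L) \mid \mathcal{G}_l\big),$$

which implies

$$E\Big(\mathrm{Cov}\big(P(\hat{\xi}_v > x \mid \mathcal{G}_L), P(\hat{\xi}_w > x \mid \mathcal{G}_L) \mid \mathcal{G}_l\big)\Big) \le E\big(P(\hat{\xi}_v > x \mid \mathcal{G}_L) P(\hat{\xi}_w > x \mid \mathcal{G}_L)\big). \tag{112}$$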

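Denote by $(v_u, w_u)$ a general pair of vertices with closest ancestor $u$. Then
(112) implies that

$$E\big(\mathrm{Var}(S_1 \mid \mathcal{G}_l)\big) \le \sum_{d=l}^{L} \sum_{u : d(u) = d} \sum_{(v_u, w_u)} E\big(P(\hat{\xi}_{v_u} > x \mid \mathcal{G}_L) P(\hat{\xi}_{w_u} > x \mid \mathcal{G}_L)\big). \tag{113}$$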

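Recall that $\mathcal{G}_{d+1}$ is the $\sigma$-field generated by $\{\hat{n}_v : d(v) \le d + 1\}$. For the
pair $(v_u, w_u)$ with $d(u) = d$, conditioned on $\mathcal{G}_{d+1}$, $P(\hat{\xi}_{v_u} > x \mid \mathcal{G}_L)$
and $P(\hat{\xi}_{w_u} > x \mid \mathcal{G}_L)$ are independent. Thus,

$$E\big(P(\hat{\xi}_{v_u} > x \mid \mathcal{G}_L) P(\hat{\xi}_{w_u} > x \mid \mathcal{G}_L) \mid \mathcal{G}_{d+1}\big) = E\big(P(\hat{\xi}_{v_u} > x \mid \mathcal{G}_L) \mid \mathcal{G}_{d+1}\big)\, E\big(P(\hat{\xi}_{w_u} > x \mid \mathcal{G}_L) \mid \mathcal{G}_{d+1}\big). \tag{114}$$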
Let $v, w : v \wedge w = u$ denote that the vertices $v$, $w$ have closest ancestor $u$.
Using (68) and (69), by similar calculations as in (87) for $v$, $w$ and $u$ with
$d(v) = j$, $d(w) = k$ and $d(u) = d$ respectively (where $u$ is the closest ancestor
to $v$ and $w$), we get



E E E ^iP{^v>x\^L)P{€.>x\^L) 



d=l u:d(u)=d i^,weTu: 

V/\'W^U 



u+1 



:ii5) 



< 



Z-^ Z-^ Z-^ Z-^ \fji 



1,^ m^^,^lW^.,^ 



=/ u:d(u)=d j=d «eT„ 
d{v)=j 



X 



^d+1 ■ 



Z-^ Z-^ Z-^ Z-^ \ rn 

d=l u:d(u)=d k=d ™ST„: 

d(itj) — fc 



m X 



%. 



d+l 



(116) 



where the inequality follows by applying (114) and using analogous calculations
as in [10, equations (83)-(84)]. Note that the expected value of the
left-hand side of the inequality in (115) is equal to the right-hand side of the
inequality in (113). Let $u$ be the closest ancestor vertex of $v$ and $w$. Let $u_v$ be
the child of $u$ that is an ancestor of $v$, respectively $u_w$ be the child of $u$ that
is an ancestor of $w$. Let $W_{u,v}$ be the component in the split vector of vertex
$u$ that corresponds to the child $u_v$ of $u$, and use the analogous notation for
$W_{u,w}$. For a triple $(v, w, u)$ with $d(v) = j$, $d(w) = k$ and $d(u) = d$ we have

$$\hat{n}_v = n \prod_{r=1}^{j} W_{r,v} = n\,W_{u,v} \prod_{r=1}^{d} W_{r,u} \prod_{r=d+2}^{j} W_{r,v},$$

$$\hat{n}_w = n \prod_{r=1}^{k} W_{r,w} = n\,W_{u,w} \prod_{r=1}^{d} W_{r,u} \prod_{r=d+2}^{k} W_{r,w}. \tag{117}$$



For given $d(u) = d \ge l$, $d(v) = j$ and $d(w) = k$, there are at most
$b^d$ choices of $u$, and then at most $b^{j-d}$ choices of $v$ and $b^{k-d}$ choices of $w$.
(We can assume that $j > d + 1$ and $k > d + 1$, since it is easy to see that
the other terms are few and the sum of them is small.) For the child $u_v$
of $u$, $d(u_v) = d + 1$ and we have $Y_{d+1,u_v} = -\sum_{r=1}^{d} \ln W_{r,u} - \ln W_{u,v}$ (and
for the child $u_w$ of $u$, $Y_{d+1,u_w}$ is defined in analogy). Recall the definition
of $Z_{j-l,v}$ in (106). For the vertex $v$ with $d(v) = j$ we have
that $Z_{j-d-1,v} = Y_{j,v} - Y_{d+1,u_v} = -\sum_{r=d+2}^{j} \ln W_{r,v}$ (and the analogous notation for $Z_{k-d-1,w}$).
(For simplicity we skip the vertex index in the calculations below.) Thus, by
similar calculations as in (88) and (93) the sum in (115) is equal to

L 



m' 







V d=l d(u)=d j=d+2 
^ rlnm-lnx-Ya+i^^ . \ / ^ i 

k=d+2 ^^ / \ d=l d{u)=d " j=l 

Since E{Wf) < t the expected value of this is o(l), and thus the right 



i>'=-''-'p{z,-,-,<t)) =oiY,Y.^.Il'^^ 



which shows (65). We proceed 



hand-side of the inequality in (113) is o(l). Hence, E( Var(5'i|^i) ) is o(l 



S2= Yl E(fa£<c]|^L) 

l<d{v)<L 



Dy showing (|66|). Recall from (60) that 
yU^^ In^ n 



■$, 



n 



where 



$, 



n 



/x~i In n 



E 



/i^^ In^n 



d{v)=L 

First we consider 

Var( Yl Egl£<c]|^L) 

l<d{v)<L 

Y Cov(e(£i[£ < c]|^l),e(£,i[£, < cpL) 



%. 



118) 



l<dlw)<L 



As we argued for showing (65), we can suppose that the closest ancestor u 



for V and w is at depth d>l. Similar to (112) 



< E(E(fa[f. < c]|^L)E(e.i£ < cpi)). 

For a vertex v with (i(f ) = j, 



E{ll[l<cpL) 



TTlTiy m-\-l 1 /mrH^>, Ti^ 



?2(?71 + 1) 



e "» 



W(^) 



< 



n 



Hw, 



r=l 
Denote by $(v_u, w_u)$ a pair of vertices with closest ancestor $u$ as in (113).
Consider one such pair $(v_u, w_u)$, and let $d(u) = d$, $d(v) = j$ and $d(w) = k$.
Since $E(W_{r,v}^2) \le \frac{1}{b^{1+\delta}}$ for some $\delta > 0$ it follows that



-(i-rf)-(fc-d) 



r=l 



(119) 

where $C_1$ is a constant depending on $E(W_{u,v} W_{u,w})$, where $W_{u,v}$ and $W_{u,w}$ are
the random variables that we introduced for (117). Thus, by using (118)-(119),
and as in (115) letting $v, w : v \wedge w = u$ denote that the vertices $v$, $w$
have closest ancestor $u$ we get



ErVar( Yl H^vlil < cpL)\'^i)) 

l<d{v)<L ^ 

L 

ZEE ^{^{lAl < CpL)Hi^l[[.. < Cpl) 



d=l u:d{u)=d ^.w^Tu- 



<CiY^ b-^'^ J2 V-^-^'-'^^ Y^ i,k-d-{k-d) < C2L%-^^ -^ 0, (120) 



k=d 



where $C_2$ is a constant. (Compare with the calculations in [10, equation
(87)].) We now show that



E(Var( ^"^"'" .$, 



%]\^Q. 



:i2ii 



n 
To show this, it is enough to show that 

, L L 

^ v:d{v)=Lr=l r=l 



%]] ^Q. 



Using (120), we obtain for each s < L 

L 



^ v:d{v)=Lr=l ■' 
Thus, the conditional Hölder inequality, see e.g., [9, p. 476], yields (121).
From (120) and (121) and again applying the conditional Hölder inequality
we deduce that $E\big(\mathrm{Var}(S_2 \mid \mathcal{G}_l)\big)$ is $o(1)$, which shows (66).

Recall from (61) that

$$S_3 = \sum_{l < d(v) \le L} \mathrm{Var}(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L).$$

It remains to show that $E\big(\mathrm{Var}(S_3 \mid \mathcal{G}_l)\big)$ is $o(1)$. To show this we observe
that

$$\mathrm{Var}(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L) \le E(\hat{\xi}_v^2 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L) \le c\,E(\hat{\xi}_v 1[\hat{\xi}_v \le c] \mid \mathcal{G}_L),$$

and thus (67) follows from (119) by similar calculations as in (120).